PulseAugur

English

PulseAugur coverage of English — every cluster mentioning English across labs, papers, and developer communities, ranked by signal.

Total · 30d: 103 (103 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 89 (89 over 90d)
SENTIMENT · 30D: 18 days with sentiment data

RECENT · PAGE 1/6 · 103 TOTAL
  1. TOOL · CL_111729

    New neural diarization model excels on low-resource Nepali-Hindi speech

    Researchers have developed a new approach to speaker diarization, the process of identifying who spoke when in an audio recording, specifically for low-resource languages like Nepali-Hindi. They trained two neural netwo…

  2. RESEARCH · CL_111601

    New framework induces hierarchies from diverse text sources

    Researchers have developed a new term-centric framework for creating interpretable hierarchical taxonomies from diverse text sources. This method uses automatic term extraction to map documents into a shared representat…

  3. TOOL · CL_109892

    LLMs match and exceed human examiner agreement on UK GCSE exams

    A new dataset of 32,534 double-marked real student responses to UK GCSE mock exams has been introduced, covering 328 questions across five subjects, including handwritten work. Researchers found that current large langu…

  4. RESEARCH · CL_111610

    New SOLAR method enhances cross-lingual reasoning in LLMs

    Researchers have developed SOLAR, a new method to improve cross-lingual reasoning in large language models. This technique aligns soft-token representations across languages, using English as a pivot to create more lang…

  5. RESEARCH · CL_109559

    Readers still prefer human translations over AI-generated literary texts

    A new study published on arXiv reveals that while AI-generated translations of literary texts are considered "fine" by readers, human translations are still preferred for their immersive quality and clarity. The researc…

  6. RESEARCH · CL_109518

    HIPE-2026 evaluates person-place relation extraction from historical texts · 3 sources tracked

    The HIPE-2026 evaluation campaign focused on extracting person-place relationships from multilingual historical texts, building upon previous HIPE editions that concentrated on named entity recognition. This year's chal…

  7. RESEARCH · CL_109547

    New red teaming framework exposes LLM faithfulness vulnerabilities

    Researchers have developed a novel red teaming framework to systematically uncover vulnerabilities in large language models (LLMs). This framework utilizes a multi-role architecture with target, attacker, and jury model…

  8. RESEARCH · CL_109568

    New neural architecture advances phoneme alignment beyond traditional methods

    Researchers have developed a novel, fully differentiable neural architecture for phoneme alignment, aiming to advance the field beyond traditional HMM-GMM frameworks. This end-to-end system features an encoder for signa…

  9. TOOL · CL_108067

    Study finds function vectors in LLMs are largely language-agnostic for translation

    Researchers have investigated whether function vectors (FVs), which represent tasks extracted from model activations during in-context learning, are language-agnostic. Using machine translation as a case study across th…

  10. RESEARCH · CL_109575

    New Japanese TTS system tackles kanji polyphony with massive data scaling

    Researchers have developed Sarashina2.2-TTS, a novel text-to-speech system specifically designed for Japanese, addressing the challenge of kanji polyphony. The system utilizes a massive dataset of approximately 361,000 …

  11. RESEARCH · CL_109576

    New AI models tackle low-resource Tangkhul-English translation

    Researchers have developed two neural machine translation systems for the low-resource Tangkhul-English language pair. The primary system, utilizing ByT5-large fine-tuned on over 38,000 parallel sentences, achieved a BL…

  12. TOOL · CL_107534

    AssemblyAI launches Medical Mode with native code-switching transcription

    AssemblyAI has introduced a new Medical Mode for its transcription models, focusing on accurate handling of code-switching within clinical conversations. Unlike systems that require language toggles, AssemblyAI's Univer…

  13. RESEARCH · CL_107116

    Data scale, not latency, dictates cross-lingual speech recognition transfer

    A new study indicates that the scale of training data, rather than latency, is the primary factor influencing the effectiveness of cross-lingual transfer in streaming speech recognition models. Researchers found that wh…

  14. RESEARCH · CL_107785

    New Marathi POS tagging dataset and BERT models released

    Researchers have introduced L3Cube-MahaPOS, a new dataset for Marathi part-of-speech (POS) tagging, addressing the scarcity of annotated resources for the language. The dataset contains over 32,000 manually annotated se…

  15. RESEARCH · CL_107768

    African languages face significant tokenization penalty in frontier LLMs

    A new research paper reveals a significant "African Language Tax" in frontier large language models, where tokenizers assign substantially more subword tokens to African languages compared to English. This results in hi…

  16. TOOL · CL_105129

    New benchmark measures LLM over-alignment in criminal law

    A new benchmark, TF-RefusalBench, has been developed to measure and mitigate over-alignment in large language models (LLMs) used within multilingual criminal law contexts. The benchmark, comprising 5,200 prompts across …

  17. RESEARCH · CL_105005

    LLMs rely on third-party sites like Wikipedia for brand info, study finds · 4 sources tracked

    A new study reveals that large language models (LLMs) primarily rely on third-party sources, such as Wikipedia and YouTube, to generate information about brands. Research indicates that Wikipedia is the most cited domai…

  18. TOOL · CL_105165

    Study compares DeepL, eTranslation, Systran MT systems for specialized French translation

    A new study evaluates the performance of three machine translation (MT) systems—DeepL, eTranslation, and Systran—in translating specialized English content into French. The research also compared the post-editing effort…

  19. MEME · CL_102473

    Reddit discusses training LLMs to think in optimized AI languages

    A discussion on Reddit explores the concept of training large language models (LLMs) to think in an optimized, non-human language instead of English. The user posits that such an approach could potentially allow AIs to …

  20. TOOL · CL_104724

    LLMs struggle with Hausa and Fongbe translation, metrics unreliable

    A new study evaluated the machine translation capabilities of four large language models (LLMs) for Hausa and Fongbe, two West African languages. The research found that while Hausa achieved acceptable translation quali…