PulseAugur

New ASR techniques tackle phonetic errors and judge reliability

Researchers are developing new methods to improve automatic speech recognition (ASR) systems, particularly for low-resource languages and for errors on rare or semantically critical words. One approach, Error-Aware TF-IDF, is a retrieval-augmented generation method that prioritizes corrective documents based on historical phonetic misrecognitions, reportedly reducing word error rates significantly. Another method, G-SPIN, combines phonetic graph modeling with large language models to correct semantically critical errors, restricting the search space to plausible phonetic alternatives. A third study concerns a different ASR, the attack-success rate reported in LLM jailbreak research: it questions the reliability of the automated judges that assign this score and reports inconsistencies in their calibration and adversarial robustness.
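As a rough illustration of what "prioritizing corrective documents based on historical misrecognitions" could look like, the sketch below expands an ASR hypothesis with terms it was confused with in the past and ranks documents by a smoothed TF-IDF score. This is not the paper's algorithm: the function name `rank_documents`, the `confusions` log format, and the `log1p` boost are all assumptions made for the example.

```python
import math
from collections import Counter


def rank_documents(hypothesis, docs, confusions, boost=1.0):
    """Rank docs for an ASR hypothesis, boosting historically confused terms.

    confusions maps a misrecognized word to {intended_word: count}.
    This hypothetical log format and the log1p boost are assumptions.
    """
    doc_tokens = [d.lower().split() for d in docs]
    n_docs = len(doc_tokens)

    # Document frequency of each term.
    df = Counter()
    for toks in doc_tokens:
        df.update(set(toks))

    # Each hypothesis token has weight 1; each term it was historically
    # confused with gets an extra weight that grows with the count.
    query_weights = Counter()
    for tok in hypothesis.lower().split():
        query_weights[tok] += 1.0
        for intended, count in confusions.get(tok, {}).items():
            query_weights[intended] += boost * math.log1p(count)

    scored = []
    for idx, toks in enumerate(doc_tokens):
        tf = Counter(toks)
        score = 0.0
        for term, weight in query_weights.items():
            if tf[term]:
                # Smoothed IDF, so unseen or rare terms never divide by zero.
                idf = math.log((1 + n_docs) / (1 + df[term])) + 1.0
                score += weight * (tf[term] / len(toks)) * idf
        scored.append((score, idx))

    # Highest score first; ties broken by original document order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored]


docs = [
    "the clinic prescribes metformin daily",
    "the forum discusses met for men events",
    "weather report for the coast",
]
hypothesis = "patient takes met foreman"
confusions = {"foreman": {"metformin": 7}}  # made-up history for the demo
ranking = rank_documents(hypothesis, docs, confusions)
```

With the made-up confusion log, the document containing "metformin" moves ahead of the one that only shares the literal token "met"; with an empty log the ranking falls back to plain TF-IDF over the hypothesis tokens.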

IMPACT Advances in ASR error correction could improve voice interfaces and transcription services, while scrutiny of LLM evaluation methods highlights the need for more robust safety testing.

RANK_REASON Multiple research papers published on arXiv detailing novel methods for ASR error correction and evaluating LLM safety judges.

AI-generated summary · Google Gemini · from 7 sources.

COVERAGE [7]

  1. arXiv cs.CL TIER_1 English(EN) · Mohammad Aref Jafari-Raddani

    Error-Aware TF-IDF Retrieval-Augmented Generation for ASR Error Correction

    arXiv:2606.24915v1. End-to-end automatic speech recognition systems frequently hallucinate rare entities and domain-specific terms, especially in low-resource languages. While retrieval-augmented generation frameworks can mitigate these errors using la…

  2. arXiv cs.CL TIER_1 English(EN) · Pratik Rakesh Singh, Mohammadi Zaki, Aneesh Mukkamala, Pankaj Wasnik

    Graph-Based Phonetic Error Correction of Noisy ASR

    arXiv:2606.24889v1. Automatic speech recognition (ASR) systems, despite low overall word error rates, produce residual lexical errors that disproportionately affect semantically critical tokens such as named entities, negations, and sentiment-bearing w…

  3. arXiv cs.CL TIER_1 English(EN) · Yang Gao (Veyon Solutions)

    How Reliable Is Your Jailbreak Judge? Calibration and Adversarial Robustness of Automated ASR Scoring

    arXiv:2606.25487v1. Almost every paper on LLM jailbreaks and prompt injection reports an attack-success rate (ASR), and that number is assigned not by people but by an automated judge: either a safety classifier trained for the task, or a general chat …

  4. Hugging Face Daily Papers TIER_1 English(EN)

    How Reliable Is Your Jailbreak Judge? Calibration and Adversarial Robustness of Automated ASR Scoring

    Almost every paper on LLM jailbreaks and prompt injection reports an attack-success rate (ASR), and that number is assigned not by people but by an automated judge: either a safety classifier trained for the task, or a general chat model prompted to grade. The judge is rarely che…

  5. arXiv cs.CL TIER_1 English(EN) · Yang Gao

    How Reliable Is Your Jailbreak Judge? Calibration and Adversarial Robustness of Automated ASR Scoring

    Almost every paper on LLM jailbreaks and prompt injection reports an attack-success rate (ASR), and that number is assigned not by people but by an automated judge: either a safety classifier trained for the task, or a general chat model prompted to grade. The judge is rarely che…

  6. arXiv cs.IR (Information Retrieval) TIER_1 English(EN) · Mohammad Aref Jafari-Raddani

    Error-Aware TF-IDF Retrieval-Augmented Generation for ASR Error Correction

    End-to-end automatic speech recognition systems frequently hallucinate rare entities and domain-specific terms, especially in low-resource languages. While retrieval-augmented generation frameworks can mitigate these errors using large language models, current architectures face …

  7. Towards AI TIER_1 English(EN) · Dmitriy Nikultsev

    Why Word Error Rate Is Not Enough: Semantic Decomposition of ASR Errors

    A feasible framework for evaluating ASR models across semantic categories instead of a single aggregate metric …
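
The idea in the last entry, reporting error rates per semantic category instead of one aggregate word error rate (WER), can be sketched in a few lines. This is a minimal illustration and not the article's framework: the category labels, the function names, and the choice to charge only substitutions and deletions to a category (insertions count toward overall WER only) are assumptions.

```python
from collections import Counter


def align(ref, hyp):
    """Word-level Levenshtein alignment.

    Returns (edit_distance, correct), where correct[i] is True when
    reference word i is matched exactly in one minimum-edit alignment.
    """
    n, m = len(ref), len(hyp)
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i
    for j in range(m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = dist[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            dist[i][j] = min(sub, dist[i - 1][j] + 1, dist[i][j - 1] + 1)

    # Backtrace to find which reference words survived unchanged.
    correct = [False] * n
    i, j = n, m
    while i > 0 and j > 0:
        if dist[i][j] == dist[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1]):
            correct[i - 1] = ref[i - 1] == hyp[j - 1]
            i, j = i - 1, j - 1
        elif dist[i][j] == dist[i - 1][j] + 1:
            i -= 1  # deletion: reference word has no counterpart
        else:
            j -= 1  # insertion: extra hypothesis word
    return dist[n][m], correct


def wer_by_category(ref, hyp, categories):
    """Overall WER plus per-category error rate over reference words.

    Assumes ref is non-empty and categories has one label per ref word.
    """
    edits, correct = align(ref, hyp)
    totals, errors = Counter(), Counter()
    for ok, cat in zip(correct, categories):
        totals[cat] += 1
        errors[cat] += not ok
    per_category = {cat: errors[cat] / totals[cat] for cat in totals}
    return edits / len(ref), per_category


ref = "call doctor smith in boston tomorrow".split()
hyp = "call doctor smyth in austin tomorrow".split()
# Hypothetical labels; a real pipeline might get them from an NER tagger.
categories = ["other", "other", "person", "other", "place", "other"]
overall, per_category = wer_by_category(ref, hyp, categories)
```

In this toy case the aggregate WER is one third, yet every person and place name is wrong, which is the kind of gap a single aggregate number hides.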