PulseAugur

Research reveals bias in IPA-based speech recognition systems

A new research paper analyzes demographic biases in phoneme-based Automatic Speech Recognition (ASR) systems, specifically those that generate International Phonetic Alphabet (IPA) transcriptions. The study evaluates two open-source systems, WhisperIPA and ZIPA, using diverse speech corpora and demographically annotated English data. The findings indicate persistent performance disparities across gender, accent, ethnicity, and age groups, even after accounting for linguistically similar phoneme substitutions.
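For readers unfamiliar with how such disparities are usually quantified, the following is a minimal sketch, not the paper's method. It assumes a plain phoneme error rate (edit distance over phoneme sequences divided by reference length), computed per demographic group. The function names, group labels, and sample data are all hypothetical.

```python
from collections import defaultdict


def edit_distance(ref, hyp):
    """Levenshtein distance between two phoneme sequences (unit costs)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def per_by_group(samples):
    """Phoneme error rate per group.

    samples: iterable of (group, reference_phonemes, hypothesis_phonemes).
    """
    errors = defaultdict(int)
    lengths = defaultdict(int)
    for group, ref, hyp in samples:
        errors[group] += edit_distance(ref, hyp)
        lengths[group] += len(ref)
    return {g: errors[g] / lengths[g] for g in errors if lengths[g] > 0}


def max_gap(rates):
    """Largest difference in error rate between any two groups."""
    return max(rates.values()) - min(rates.values())


# Hypothetical toy data: (group label, reference IPA, model output IPA).
samples = [
    ("group_a", ["k", "æ", "t"], ["k", "æ", "t"]),
    ("group_a", ["d", "ɔ", "g"], ["d", "ɑ", "g"]),
    ("group_b", ["k", "æ", "t"], ["g", "æ", "d"]),
    ("group_b", ["d", "ɔ", "g"], ["d", "ɔ"]),
]

rates = per_by_group(samples)
print(rates)
print(max_gap(rates))
```

Every substitution costs 1 here. Accounting for linguistically similar phonemes, as the summary describes, would need a substitution cost that is lower for close pairs such as /ɔ/ and /ɑ/; that weighting is left out because the summary does not specify it.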

IMPACT Highlights potential biases in IPA transcription models, informing the development of more inclusive and robust phoneme-based ASR systems.

RANK_REASON The cluster contains a research paper analyzing bias in ASR systems.


AI-generated summary · Google Gemini · from 3 sources.

COVERAGE [3]

  1. arXiv cs.CL TIER_1 English(EN) · Catherine Bao, Maneesha Rani Saha, Neal Patwari

    Evaluating Bias in Phoneme-Based Automatic Speech Recognition Systems: An Analysis of IPA Transcription Models

    arXiv:2606.11639v1. Abstract: The popularization of automatic speech recognition (ASR) systems has increased exploration of the demographic biases related to race, age, gender, and accent, often formed from imbalanced training data. Most of these studies focused…

  2. arXiv cs.CL TIER_1 English(EN) · Neal Patwari

    Evaluating Bias in Phoneme-Based Automatic Speech Recognition Systems: An Analysis of IPA Transcription Models

    The popularization of automatic speech recognition (ASR) systems has increased exploration of the demographic biases related to race, age, gender, and accent, often formed from imbalanced training data. Most of these studies focused on standard grapheme-based ASR systems with com…

  3. r/LocalLLaMA TIER_1 English(EN) · /u/matt8p

    How I implemented ASR bias for voice transcription models [Open Source]

    https://www.reddit.com/r/LocalLLaMA/comments/1u2vr8g/how_i_implemented_asr_bias_for_voice/