PulseAugur

New benchmark targets real-world Indian speech recognition

Researchers have introduced "Voice of India," a new benchmark for evaluating automatic speech recognition (ASR) across 15 major Indian languages. Unlike previous benchmarks built on scripted speech, the dataset comprises unscripted telephonic conversations from over 36,000 speakers, totaling 536 hours. The benchmark accounts for spelling variations common in Indian languages and breaks down ASR performance by geography, revealing disparities across regions as well as by factors such as audio quality and device type.
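
To illustrate why spelling variation matters for scoring, here is a minimal Python sketch of word error rate (WER) computed with and without a table of accepted spellings. It is not the paper's scoring method: the function names `word_errors` and `wer` and the `variants` table are hypothetical, chosen only for this example. The one variant pair used, हिंदी and हिन्दी, is a pair of common spellings of the same word.

```python
from typing import Dict, List, Optional, Set


def word_errors(ref: List[str], hyp: List[str],
                variants: Optional[Dict[str, Set[str]]] = None) -> int:
    """Word-level edit distance; a hypothesis word matches a reference
    word if it equals it or is listed as an accepted variant of it."""
    variants = variants or {}
    prev = list(range(len(hyp) + 1))  # distance from empty reference
    for i, r in enumerate(ref, 1):
        accepted = {r} | variants.get(r, set())
        curr = [i]  # distance from empty hypothesis
        for j, h in enumerate(hyp, 1):
            cost = 0 if h in accepted else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(ref: List[str], hyp: List[str],
        variants: Optional[Dict[str, Set[str]]] = None) -> float:
    """Word error rate: word-level edit distance over reference length."""
    if not ref:
        raise ValueError("reference must contain at least one word")
    return word_errors(ref, hyp, variants) / len(ref)


# Hypothetical example: the reference and hypothesis differ only in spelling.
reference = ["मुझे", "हिंदी", "आती", "है"]
hypothesis = ["मुझे", "हिन्दी", "आती", "है"]
accepted_spellings = {"हिंदी": {"हिन्दी"}}  # assumed variant table

print(wer(reference, hypothesis))                      # strict single reference
print(wer(reference, hypothesis, accepted_spellings))  # variant-aware
```

Under strict single-reference scoring the hypothesis is charged one substitution out of four words, while the variant-aware score counts it as fully correct.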

IMPACT Addresses limitations in current ASR systems for Indian languages, potentially improving accessibility and usability of voice technologies across diverse regions.

RANK_REASON The cluster contains an academic paper introducing a new benchmark dataset.


AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. arXiv cs.CL TIER_1 English(EN) · Kaushal Bhogale, Manas Dhir, Amritansh Walecha, Manmeet Kaur, Vanshika Chhabra, Aaditya Pareek, Hanuman Sidh, Mahima Manik, Sagar Jain, Bhaskar Singh, Utkarsh Singh, Tahir Javed, Shobhit Banga, Mitesh M. Khapra ·

    Voice of India: A Large-Scale Benchmark for Real-World Speech Recognition in India

    arXiv:2604.19151v2. Abstract: Existing Indic ASR benchmarks often use scripted, clean speech and leaderboard-driven evaluation that encourages dataset-specific overfitting. In addition, strict single-reference WER penalizes natural spelling variation in Indi…