Researchers have developed a new framework for evaluating Text-to-Speech (TTS) systems, particularly for Indian languages. The framework uses crowdsourced pairwise comparisons across six perceptual dimensions: intelligibility, expressiveness, voice quality, liveliness, noise, and hallucinations. In the study, over 1,900 native raters provided more than 120,000 comparisons of 7 state-of-the-art TTS systems on over 5,000 sentences in 10 Indian languages. The results provide a multilingual leaderboard and an analysis of model trade-offs.
AI IMPACT Establishes a new benchmark for evaluating TTS quality, particularly for underrepresented languages, potentially driving improvements in multilingual voice synthesis.
RANK_REASON Academic paper detailing a new evaluation methodology for TTS systems.
AI-generated summary · Google Gemini · from 1 source.