PulseAugur

ASR models evaluated on Dutch child speech, Whisper-medium leads

A new study on arXiv evaluates nine state-of-the-art automatic speech recognition (ASR) models, including Whisper, Parakeet, and Wav2Vec2, on Dutch child speech datasets. The fine-tuned Whisper-medium model performed best overall, achieving a word error rate (WER) of 5.54% on the JASMIN dataset and 70.37% on the more challenging DART dataset. The researchers also developed a method to automatically identify correctly pronounced utterances with high confidence, which reduces the need for manual verification and allows a significant portion of the data to be transcribed automatically.
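For readers unfamiliar with the metric, WER is conventionally the number of word-level substitutions, deletions, and insertions needed to turn the ASR output into the reference transcript, divided by the number of reference words. The sketch below illustrates that standard definition only. It is not the paper's evaluation code, the function name `word_error_rate` is hypothetical, and the lowercase-and-split-on-whitespace normalization is an assumption, since the study's own text normalization is not described here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    # Assumed normalization: lowercase and split on whitespace.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion (reference word missing)
                curr[j - 1] + 1,                  # insertion (extra hypothesis word)
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Made-up example sentences, not taken from JASMIN or DART.
    print(word_error_rate("de kat zit op de mat", "de kat zit op mat"))
```

Under this definition, a WER of 5.54% corresponds to roughly 5 or 6 word-level errors per 100 reference words, and 70.37% to roughly 70. Because insertions count as errors, WER can exceed 100%.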

IMPACT This research could improve the accuracy and efficiency of transcribing children's speech for linguistic studies.

RANK_REASON This is a research paper detailing the performance of ASR models on a specific type of speech data.


AI-generated summary · Google Gemini · from 1 source.


COVERAGE [1]

  1. arXiv cs.AI TIER_1 English(EN) · Gus Lathouwers, Lingyun Gao, Catia Cucchiarini, Helmer Strik ·

    Transcribing Children's Speech: ASR Performance and Obtaining Reliable Orthographic Transcriptions

    arXiv:2605.28833v1 Announce Type: cross Abstract: Automatic speech recognition (ASR) has the potential to substantially reduce manual annotation effort in child speech research by generating automatic transcriptions. However, obtaining reliably high-quality ASR transcriptions for…