PulseAugur

Raw waveform acoustic models set state of the art for their class on TIMIT phone recognition

Researchers have analyzed the error patterns of raw waveform acoustic models on TIMIT phone recognition, looking beyond the overall phone error rate (PER). They decomposed PER across three broad phonetic class (BPC) categorizations and built confusion matrices from substitution errors. The models achieved state-of-the-art results among raw waveform systems on TIMIT, and transfer learning from WSJ improved performance further, particularly for consonants.
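
As a rough illustration of the kind of analysis described above, the sketch below aligns a reference and a hypothesis phone sequence with a standard Levenshtein alignment, then derives the PER, a substitution confusion count, and a per-class error rate. It is not the paper's code: the function names `align` and `analyse`, the toy phone sequences, and the two-class `PHONE_CLASS` map are all assumptions made for this example. Attributing substitutions and deletions to the class of the reference phone, and leaving insertions out of the per-class rate, is one possible convention and may differ from the authors' choice.

```python
from collections import Counter


def align(ref, hyp):
    """Levenshtein-align two phone lists; return (op, ref_phone, hyp_phone) tuples."""
    n, m = len(ref), len(hyp)
    # cost[i][j] = edit distance between ref[:i] and hyp[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            cost[i][j] = min(diag, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Backtrace from the bottom-right corner to recover the edit operations.
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and cost[i][j] == cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])):
            op = "ok" if ref[i - 1] == hyp[j - 1] else "sub"
            ops.append((op, ref[i - 1], hyp[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            ops.append(("del", ref[i - 1], None))
            i -= 1
        else:
            ops.append(("ins", None, hyp[j - 1]))
            j -= 1
    ops.reverse()
    return ops


def analyse(ref, hyp, phone_class):
    """Return (PER, substitution confusion counts, per-class error rate)."""
    ops = align(ref, hyp)
    n_errors = sum(1 for op, _, _ in ops if op != "ok")
    per = n_errors / len(ref)  # errors normalised by reference length
    confusions = Counter((r, h) for op, r, h in ops if op == "sub")
    # Assumption: substitutions and deletions count against the class of the
    # reference phone; insertions have no reference phone and are skipped here.
    class_errors = Counter(phone_class[r] for op, r, _ in ops if op in ("sub", "del"))
    class_totals = Counter(phone_class[p] for p in ref)
    class_rate = {c: class_errors[c] / class_totals[c] for c in class_totals}
    return per, confusions, class_rate


# Hypothetical two-class map and toy sequences, for illustration only.
PHONE_CLASS = {"s": "consonant", "sh": "consonant", "t": "consonant",
               "ih": "vowel", "ah": "vowel"}
ref = ["s", "ih", "t", "ah"]
hyp = ["sh", "ih", "ah"]

per, confusions, class_rate = analyse(ref, hyp, PHONE_CLASS)
print(per, dict(confusions), class_rate)
```

Summing the confusion counts over a whole test set gives the off-diagonal entries of a confusion matrix, and swapping in a different `phone_class` map gives a different broad-class breakdown of the same errors.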

IMPACT Breaking phone errors down by phonetic class shows where raw waveform models fail, which could guide work on more accurate speech recognition systems.

RANK_REASON The cluster contains an academic paper detailing new research findings and model performance.

Read on arXiv cs.CL →

AI-generated summary · Google Gemini · from 2 sources.

COVERAGE [2]

  1. arXiv cs.AI TIER_1 English(EN) · Erfan Loweimi, Zhengjun Yue, Andrea Carmantini, Zoran Cvetkovic, Steve Renals, Peter Bell ·

    Phonetic Error Analysis of Raw Waveform Acoustic Models

    arXiv:2606.07030v1. Abstract: We analyse error patterns of raw waveform acoustic models on TIMIT phone recognition beyond the overall phone error rate (PER). PER is decomposed across three broad phonetic class (BPC) categorisations, and confusion matrices are …

  2. arXiv cs.CL TIER_1 English(EN) · Peter Bell ·

    Phonetic Error Analysis of Raw Waveform Acoustic Models

    We analyse error patterns of raw waveform acoustic models on TIMIT phone recognition beyond the overall phone error rate (PER). PER is decomposed across three broad phonetic class (BPC) categorisations, and confusion matrices are constructed from substitution errors. Our models c…