PulseAugur

Wav2Vec 2.0 model interpretability for pathological speech assessment studied

Researchers have investigated the interpretability of a Wav2Vec 2.0-based model used to assess speech intelligibility in oral and oropharyngeal cancer patients. Using canonical correlation analysis, they measured the correlation between the model's embeddings and acoustic features (eGeMAPS low-level descriptors). The study found that the model's learned representations are most strongly associated with spectral and prosodic features, with the first Mel-frequency cepstral coefficient showing the highest correlations across all layers. The work clarifies how speech assessment models encode acoustic information and offers practical guidance for selecting acoustic features in pathological speech analysis.
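To make the canonical correlation analysis step concrete, here is a minimal sketch in Python. It is not the paper's code: the data are synthetic stand-ins (random "embeddings" and "acoustic features" generated in place of real Wav2Vec 2.0 activations and eGeMAPS descriptors), and the function name `canonical_correlations` and all array shapes are assumptions made for illustration. It computes canonical correlations with a QR-plus-SVD formulation, which assumes both matrices have full column rank.

```python
import numpy as np


def canonical_correlations(X, Y):
    """Canonical correlations between two views, sorted in descending order.

    X: (n_samples, d_x) array, e.g. embeddings from one model layer.
    Y: (n_samples, d_y) array, e.g. frame-level acoustic descriptors.
    Assumes both centered matrices have full column rank.
    """
    Xc = X - X.mean(axis=0)  # center each embedding dimension
    Yc = Y - Y.mean(axis=0)  # center each acoustic feature
    Qx, _ = np.linalg.qr(Xc)  # orthonormal basis for the span of X
    Qy, _ = np.linalg.qr(Yc)  # orthonormal basis for the span of Y
    # Singular values of Qx^T Qy are the cosines of the principal angles,
    # which equal the canonical correlations.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)  # guard against floating-point overshoot


# Synthetic stand-in data (hypothetical, not from the study).
rng = np.random.default_rng(0)
n_frames = 500
embeddings = rng.standard_normal((n_frames, 16))

# Features that are noisy linear functions of the embeddings ...
mixing = rng.standard_normal((16, 2))
related = embeddings @ mixing + 0.1 * rng.standard_normal((n_frames, 2))
# ... versus features generated independently of the embeddings.
unrelated = rng.standard_normal((n_frames, 2))

rho_related = canonical_correlations(embeddings, related)
rho_unrelated = canonical_correlations(embeddings, unrelated)
print("related features:  ", np.round(rho_related, 3))
print("unrelated features:", np.round(rho_unrelated, 3))
```

In a layer-wise analysis like the one described, the same function would be called once per layer with that layer's embeddings as `X`, so the correlation for each acoustic feature group can be compared across layers.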

IMPACT Provides insights into how speech assessment models process acoustic data, potentially improving pathological speech analysis.

RANK_REASON Academic paper detailing a case study on model interpretability.

Read on arXiv cs.NE (Neural & Evolutionary) →

AI-generated summary · Google Gemini · from 1 source.


COVERAGE [1]

  1. arXiv cs.NE (Neural & Evolutionary) TIER_1 English(EN) · Virginie Woisard

    What Does a Pathological Speech Assessment Model Know about Acoustic Features? A Case Study on Oral and Oropharyngeal Cancer Patients

    This work investigates the interpretability of a Wav2Vec 2.0-based speech intelligibility assessment model for oral and oropharyngeal cancer patients through canonical correlation analysis. By measuring the correlation between the model embeddings and eGeMAPS low-level descriptors…