Systematic Study of Dysarthric Speech Recognition: Spectral Features and Acoustic Models
Two new research papers explore methods to improve automatic speech recognition (ASR) for individuals with dysarthria, a speech disorder often caused by neurological conditions. The first paper systematically studies spectral features and acoustic models, finding that incorporating pitch features and using the Factorized Time Delay Neural Network (F-TDNN) model can lead to significant relative improvements in word and sentence recognition. The second paper focuses on data augmentation techniques, specifically Speaking-Rate Modification (SRM) and Pitch Modification (PM), applied to the Wav2Vec2 model, demonstrating that these methods can effectively enhance ASR performance across different severity levels of dysarthria.
AI IMPACT: These advancements could significantly improve communication tools and accessibility for individuals with speech impairments.
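
To give a rough feel for what augmenting speech by changing its rate involves, the sketch below resamples a waveform by linear interpolation. This is an assumption-laden stand-in, not the SRM or PM procedure from the second paper: plain resampling changes duration and pitch together, whereas the paper treats speaking rate and pitch as separate modifications. The function names `change_speed` and `augment`, the factors 0.9 and 1.1, and the synthetic 220 Hz tone are all hypothetical choices made for illustration.

```python
import math


def change_speed(samples, factor):
    """Resample by linear interpolation (illustrative only).

    factor > 1 shortens the signal (faster), factor < 1 lengthens it (slower).
    Note: this shifts pitch as well as duration, unlike dedicated
    speaking-rate or pitch modification methods.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    if not samples:
        return []
    n_out = max(1, int(round(len(samples) / factor)))
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = min(i * factor, last)   # fractional read position in the input
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append((1.0 - frac) * samples[lo] + frac * samples[hi])
    return out


def augment(samples, factors=(0.9, 1.0, 1.1)):
    """Return one resampled copy of the signal per factor (hypothetical helper)."""
    return {f: change_speed(samples, f) for f in factors}


# One second of a synthetic 220 Hz tone standing in for a speech recording.
sample_rate = 16000
tone = [math.sin(2 * math.pi * 220 * t / sample_rate) for t in range(sample_rate)]

variants = augment(tone)
for factor, audio in variants.items():
    print(factor, len(audio))
```

In an actual training pipeline, each variant would be paired with the original transcript and added to the training set; a dedicated time-stretching or pitch-shifting method would be needed to vary rate and pitch independently.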