Researchers have developed a novel system to improve speaker verification accuracy on whispered speech. The proposed model uses an encoder-decoder structure fine-tuned on a speaker verification backbone and is optimized with both a cosine-similarity classification loss and a triplet loss. This approach achieved a 22.26% relative improvement in distinguishing normal from whispered speech and an EER of 1.88% on whispered-to-whispered comparisons, outperforming the previous leading model, ReDimNet-B2.
Summary written by gemini-2.5-flash-lite from 1 source.
IMPACT Enhances the robustness of voice identification systems, potentially improving security and privacy applications.
RANK_REASON This is a research paper published on arXiv detailing a new model for speaker verification.
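The triplet loss mentioned in the summary is a standard metric-learning objective; the paper's exact formulation is not given here, but a common variant using cosine distance on speaker embeddings can be sketched as follows (the margin value and distance choice are assumptions, not taken from the source):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def triplet_loss(anchor: np.ndarray,
                 positive: np.ndarray,
                 negative: np.ndarray,
                 margin: float = 0.2) -> float:
    """Hinge-style triplet loss on cosine distance (1 - cosine similarity).

    Pushes the anchor-positive pair (same speaker) to be closer than the
    anchor-negative pair (different speaker) by at least `margin`.
    `margin=0.2` is an illustrative default, not a value from the paper.
    """
    d_pos = 1.0 - cosine_similarity(anchor, positive)
    d_neg = 1.0 - cosine_similarity(anchor, negative)
    return max(d_pos - d_neg + margin, 0.0)
```

In a whisper-robust setup, the anchor and positive would typically be embeddings of the same speaker in different speaking styles (e.g. normal and whispered), so the loss explicitly closes the normal-whisper gap in the embedding space.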