Phoneme-Level Deepfake Detection Across Emotional Conditions Using Self-Supervised Embeddings

Phoneme-level analysis improves detection of emotionally manipulated synthetic speech

By the PulseAugur editorial team · [1 source] · 2026-05-06 04:00

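Researchers have developed a new method for detecting deepfake audio by analyzing speech at the phoneme level. The approach, which uses self-supervised embeddings, proved more effective than earlier methods that treat speech as a homogeneous signal. The study found that certain phonemes, particularly complex vowels and fricatives, show greater divergence in synthetic speech, making them key indicators for identifying manipulated audio across emotions and synthesis systems.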
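The summary does not describe the paper's actual procedure, so the following is only a minimal sketch of what a phoneme-level comparison could look like. It assumes frame-level embeddings (in practice produced by a self-supervised speech model) and phoneme boundaries (in practice produced by a forced aligner) are already available. The function names, the mean pooling, and the cosine-distance measure are all illustrative assumptions, not the authors' method.

```python
from collections import defaultdict
from math import sqrt


def mean_pool(vectors):
    """Average a non-empty list of equal-length vectors into one vector."""
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


def pool_by_phoneme(frames, segments):
    """Group frame embeddings by phoneme label.

    segments: list of (label, start_frame, end_frame), end exclusive.
    Returns {label: [one pooled vector per segment]}.
    """
    pooled = defaultdict(list)
    for label, start, end in segments:
        if end > start:  # skip empty segments
            pooled[label].append(mean_pool(frames[start:end]))
    return pooled


def cosine_distance(a, b):
    """1 - cosine similarity; returns 1.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # arbitrary convention for this sketch
    return 1.0 - dot / (norm_a * norm_b)


def phoneme_divergence(real, fake):
    """Distance between real and synthetic centroids, per shared phoneme."""
    shared = real.keys() & fake.keys()
    return {
        p: cosine_distance(mean_pool(real[p]), mean_pool(fake[p]))
        for p in shared
    }


# Toy data (hypothetical): 2-dimensional "embeddings", two phoneme segments.
segments = [("AA", 0, 2), ("S", 2, 4)]
real_frames = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
fake_frames = [[1.0, 0.0], [1.0, 0.0], [1.0, 1.0], [1.0, 1.0]]

divergence = phoneme_divergence(
    pool_by_phoneme(real_frames, segments),
    pool_by_phoneme(fake_frames, segments),
)
most_divergent = max(divergence, key=divergence.get)
```

In this toy setup the fricative-like segment `"S"` diverges more than `"AA"`, which mirrors the kind of per-phoneme ranking the summary describes. Real embeddings would have hundreds of dimensions and many segments per phoneme.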

Impact: Phoneme-level analysis offers a more interpretable and effective way to detect sophisticated audio deepfakes.

Ranking rationale: Academic paper on a new method for detecting audio deepfakes.

AI-generated summary · Google Gemini · from 1 source.

Sources [1]

arXiv cs.LG · Vamshi Nallaguntla, Shruti Kshirsagar, Anderson R. Avila · 2026-05-06 04:00

Phoneme-Level Deepfake Detection Across Emotional Conditions Using Self-Supervised Embeddings

arXiv:2605.03079v1 Announce Type: cross Abstract: Recent advances in emotional voice conversion (EVC) have enabled the generation of expressive synthetic speech, raising new concerns in audio deepfake detection. Existing approaches treat speech as a homogeneous signal and largely…