From Tokens to Faces: Investigating Discrete Speech Representations for 3D Facial Animation

Speech Representations Affect the Quality of 3D Facial Animation

By the PulseAugur Editorial Team · [2 sources] · 2026-06-11 17:41

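Researchers investigated how different speech representations affect the quality of 3D facial animation. The study compared four classes of speech representation and evaluated their effectiveness with two facial decoders, using both objective and perceptual measures. The results indicate that speech representations that encode phonetic categories yield more accurate facial animation predictions.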

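Impact: By improving how speech data is used, this research could lead to more realistic and more accurate AI-driven facial animation systems.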

Ranking rationale: This cluster contains a research paper published on arXiv that details a study of speech representations for 3D facial animation.

AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

arXiv cs.CL TIER_1 English(EN) · Pedro Correa, Olivier Perrotin, Samir Sadok, Paula Costa, Thomas Hueber · 2026-06-12 04:00

From Tokens to Faces: Investigating Discrete Speech Representations for 3D Facial Animation

arXiv:2606.13630v1 Announce Type: new Abstract: The choice of speech representation is critical in speech-driven 3D facial animation. Representations differ in what they encode: SSL features emphasize segmental and semantic cues, neural codecs yield latents optimized for acoustic…

arXiv cs.CL TIER_1 English(EN) · Thomas Hueber · 2026-06-11 17:41

From Tokens to Faces: Investigating Discrete Speech Representations for 3D Facial Animation

The choice of speech representation is critical in speech-driven 3D facial animation. Representations differ in what they encode: SSL features emphasize segmental and semantic cues, neural codecs yield latents optimized for acoustic reconstruction, and ASR-style objectives produc…