Fully Differentiable Neural Forced Alignment via Soft Dynamic Programming

New Neural Network Architecture Pushes Phoneme Alignment Beyond Traditional Methods

By the PulseAugur editorial team · [1 source] · 2026-06-24 06:42

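Researchers have developed a novel, fully differentiable neural network architecture for phoneme alignment, aiming to move the field beyond the traditional HMM-GMM framework. The model consists of an encoder with separate branches for phoneme recognition and boundary detection, paired with a decoder that uses differentiable soft dynamic programming. The system is optimized with a contrastive loss. It delivers strong results on English phoneme alignment benchmarks and generalizes to languages not seen during training.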
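To make the idea of differentiable soft dynamic programming concrete, here is a minimal sketch in plain Python. It is not the paper's decoder: the function names `softmin` and `soft_alignment_cost`, the frame-by-phoneme cost matrix, the stay-or-advance transition rule, and the temperature `gamma` are all assumptions chosen for illustration. The sketch replaces the hard `min` of a classic monotonic alignment recursion with a smooth minimum, which is what makes the total alignment cost differentiable with respect to every entry of the cost matrix.

```python
import math


def softmin(values, gamma):
    """Smooth minimum: -gamma * log(sum(exp(-v / gamma))).

    Approaches min(values) as gamma -> 0. Infinite entries are skipped.
    """
    finite = [v for v in values if v != math.inf]
    if not finite:
        return math.inf
    m = min(finite)  # subtract the minimum for numerical stability
    return m - gamma * math.log(sum(math.exp(-(v - m) / gamma) for v in finite))


def soft_alignment_cost(cost, gamma=1.0):
    """Soft cost of monotonically aligning T frames to K phonemes.

    cost[t][k] is a hypothetical mismatch score between frame t and phoneme k.
    A path starts at (0, 0), ends at (T-1, K-1), and at each frame either
    stays in the current phoneme or advances to the next one.
    """
    T, K = len(cost), len(cost[0])
    D = [[math.inf] * K for _ in range(T)]
    D[0][0] = cost[0][0]
    for t in range(1, T):
        for k in range(min(t + 1, K)):
            stay = D[t - 1][k]
            advance = D[t - 1][k - 1] if k > 0 else math.inf
            D[t][k] = cost[t][k] + softmin([stay, advance], gamma)
    return D[T - 1][K - 1]


# Toy example: 3 frames, 2 phonemes (values are made up for illustration).
toy_cost = [[0.0, 5.0], [0.0, 5.0], [5.0, 0.0]]
soft_value = soft_alignment_cost(toy_cost, gamma=1.0)
near_hard_value = soft_alignment_cost(toy_cost, gamma=1e-3)
```

With a small `gamma` the result approaches the cost of the single best path, while a larger `gamma` blends in the cost of competing paths. In a real system the same recursion would be written in an automatic-differentiation framework so that gradients flow back into the encoder.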

Impact: By improving phoneme alignment, this research could lead to more accurate and more robust speech recognition systems.



AI-generated summary · Google Gemini · based on 1 source.

Sources [1]

arXiv cs.CL · Joseph Keshet · 2026-06-24 06:42

Fully Differentiable Neural Forced Alignment via Soft Dynamic Programming

Recent advances in sequence modeling have significantly improved ASR systems, bringing them close to human-level recognition accuracy and enhancing robustness across diverse acoustic conditions and languages. In contrast, Forced Alignment has not experienced comparable progress, …
