UMA-Split: unimodal aggregation for both English and Mandarin non-autoregressive speech recognition

New UMA-Split model improves English and Mandarin speech recognition

By the PulseAugur editorial team · [1 source] · 2026-06-18 04:00

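Researchers have developed UMA-Split, a non-autoregressive model designed for both English and Mandarin speech recognition. It addresses a limitation of the original unimodal aggregation (UMA) method, which struggles with languages such as English, where tokens may not align well with acoustic frames. UMA-Split introduces a split module that lets each aggregated frame map to multiple tokens, which improves representation learning and recognition performance across both languages.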
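The idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each acoustic frame carries a positive scalar weight, that segment boundaries fall at local minima (valleys) of those weights, and that the split module yields a fixed number of token slots per aggregated frame, with a blank symbol marking unused slots. The function names `find_segment_starts`, `aggregate`, and `split_decode`, and the `<blank>` symbol, are hypothetical.

```python
from typing import List

BLANK = "<blank>"  # hypothetical symbol for an unused token slot


def find_segment_starts(weights: List[float]) -> List[int]:
    """Return the start index of each segment.

    Assumption: a new segment starts at every local minimum (valley)
    of the weights, so the weights rise and then fall inside a segment.
    """
    starts = [0]
    for t in range(1, len(weights) - 1):
        if weights[t] <= weights[t - 1] and weights[t] < weights[t + 1]:
            starts.append(t)
    return starts


def aggregate(frames: List[List[float]], weights: List[float]) -> List[List[float]]:
    """Collapse each segment into one frame by a weighted average.

    Assumption: all weights are positive, so no segment sums to zero.
    """
    starts = find_segment_starts(weights)
    ends = starts[1:] + [len(frames)]
    dim = len(frames[0])
    aggregated = []
    for s, e in zip(starts, ends):
        total = sum(weights[s:e])
        aggregated.append(
            [sum(weights[t] * frames[t][d] for t in range(s, e)) / total
             for d in range(dim)]
        )
    return aggregated


def split_decode(slots_per_frame: List[List[str]]) -> List[str]:
    """Flatten the token slots of every aggregated frame and drop blanks.

    Simplification: only blank removal is done here; no repeat merging.
    """
    return [tok for slots in slots_per_frame for tok in slots if tok != BLANK]


# Toy input: eight 1-dimensional frames whose weights form two "hills".
weights = [0.1, 0.6, 0.9, 0.4, 0.2, 0.7, 0.8, 0.3]
frames = [[1.0]] * 4 + [[3.0]] * 4
aggregated = aggregate(frames, weights)

# Hypothetical split-module output: two token slots per aggregated frame.
tokens = split_decode([["he", "llo"], ["world", BLANK]])
```

With a single slot per aggregated frame, the second function could never emit both "he" and "llo" from the first segment. Allowing several slots, with blanks for the unused ones, is what lets one aggregated frame cover more than one token.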

Impact: Introduces a new method for improving speech recognition accuracy across languages.

Ranking rationale: This cluster describes a new research paper that proposes a new speech recognition model.


AI-generated summary · Google Gemini · based on 1 source.

Sources [1]

arXiv cs.CL TIER_1 English(EN) · Ying Fang, Xiaofei Li · 2026-06-18 04:00

UMA-Split: unimodal aggregation for both English and Mandarin non-autoregressive speech recognition

arXiv:2509.14653v2 (announce type: replace). Abstract: This paper proposes a unimodal aggregation (UMA) based nonautoregressive model for both English and Mandarin speech recognition. The original UMA explicitly segments and aggregates acoustic frames (with unimodal weights that fir…
