PulseAugur
English(EN) Multilingual Word-Level Forced Alignment with Self-Supervised Representations and Learned Dynamic Programming

New Method Improves Multilingual Word-Level Speech Alignment

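Researchers have developed a new method for multilingual word-level forced alignment that combines representations from the Massively Multilingual Speech (MMS) model with a self-supervised phoneme boundary detector. A learned dynamic programming decoder then infers precise word boundaries. On the TIMIT and Buckeye datasets, the system outperforms existing methods such as the Montreal Forced Aligner (MFA). It also shows promising results on unseen languages, which suggests it could scale to the more than 1,100 languages supported by MMS.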
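To make the decoding step concrete, here is a minimal sketch of classical dynamic-programming alignment. It is not the paper's learned decoder: it assumes a fixed matrix of per-frame, per-word log-scores is already available (in the paper these would come from the encoder), and the function name `align_words` and the score layout are hypothetical, chosen only for illustration.

```python
from typing import List, Tuple


def align_words(scores: List[List[float]]) -> List[Tuple[int, int]]:
    """Monotonic alignment of K words to T frames by dynamic programming.

    scores[t][k] is an assumed log-score that frame t belongs to word k.
    Returns one (start, end) frame span per word, end exclusive, such that
    the spans are contiguous, ordered, and cover every frame.
    """
    T, K = len(scores), len(scores[0])
    if T < K:
        raise ValueError("need at least one frame per word")
    neg_inf = float("-inf")
    # best[t][k]: best total score for frames 0..t with frame t in word k
    best = [[neg_inf] * K for _ in range(T)]
    # starts_word[t][k]: True if frame t is the first frame of word k
    starts_word = [[False] * K for _ in range(T)]
    best[0][0] = scores[0][0]
    starts_word[0][0] = True
    for t in range(1, T):
        for k in range(min(t + 1, K)):
            stay = best[t - 1][k]  # frame t-1 was already in word k
            advance = best[t - 1][k - 1] if k > 0 else neg_inf  # word k begins at t
            if advance > stay:
                best[t][k] = advance + scores[t][k]
                starts_word[t][k] = True
            else:
                best[t][k] = stay + scores[t][k]
    # Backtrace from the last frame of the last word to recover boundaries.
    segments: List[Tuple[int, int]] = []
    k, end = K - 1, T
    for t in range(T - 1, -1, -1):
        if starts_word[t][k]:
            segments.append((t, end))
            end = t
            k -= 1
            if k < 0:
                break
    segments.reverse()
    return segments


# Toy input: five frames, two words; word 0 fits frames 0-2, word 1 fits frames 3-4.
toy_scores = [[0, -5], [0, -5], [0, -5], [-5, 0], [-5, 0]]
print(align_words(toy_scores))  # [(0, 3), (3, 5)]
```

Multiplying the frame indices by the encoder's frame hop would turn these spans into timestamps. The hop length depends on the model and is not stated in the summary.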

Impact: Improves the accuracy of multilingual speech processing, which may benefit cross-lingual AI applications.

Ranking rationale: This cluster contains an academic paper detailing a new method for speech alignment.

Read on arXiv cs.CL →

AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

  1. arXiv cs.CL TIER_1 English(EN) · Roy Weber, Meidan Zehavi, Rotem Rousso, Joseph Keshet

    Multilingual Word-Level Forced Alignment with Self-Supervised Representations and Learned Dynamic Programming

    arXiv:2606.10675v1. Abstract: We present a method for accurate multilingual word-level forced alignment, consisting of an alignment encoder and a learned alignment decoder. The encoder integrates two representations: one from the Massively Multilingual Speech (M…

  2. arXiv cs.CL TIER_1 English(EN) · Joseph Keshet

    Multilingual Word-Level Forced Alignment with Self-Supervised Representations and Learned Dynamic Programming

    We present a method for accurate multilingual word-level forced alignment, consisting of an alignment encoder and a learned alignment decoder. The encoder integrates two representations: one from the Massively Multilingual Speech (MMS) model and another from a self-supervised pho…