Progressive Alignment Objectives for Aligner-Encoder based ASR

InterAligner, a new automatic speech recognition method, improves training stability and reduces errors

By the PulseAugur editorial team · [2 sources] · 2026-06-23 05:09

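Researchers have developed a method called InterAligner to improve the training stability and performance of Aligner-Encoder based automatic speech recognition (ASR) models. The method introduces an intermediate aligner objective and an intermediate CTC loss, so that the alignment forms progressively across the model's layers rather than emerging abruptly. Tested on the LibriSpeech dataset with a 17-layer Conformer, InterAligner reaches word error rates (WER) of 3.1%/5.6% on test-clean/other, outperforming previous methods, particularly on longer utterances.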
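For readers unfamiliar with the metric, the WER figures above are word-level edit distances divided by the number of reference words. The following is a minimal sketch of that standard definition, not the paper's scoring script; the function name `word_error_rate` and the whitespace tokenization are illustrative assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion: reference word missing
                curr[j - 1] + 1,         # insertion: extra hypothesis word
                prev[j - 1] + (r != h),  # substitution, or match at no cost
            ))
        prev = curr
    return prev[-1] / len(ref)


# One substituted word out of four reference words.
print(word_error_rate("a b c d", "a x c d"))
```

A WER of 3.1% therefore corresponds to roughly 3 word errors per 100 reference words.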

Impact: This research could lead to more robust and accurate speech recognition systems, particularly for longer audio inputs.

Ranking rationale: This cluster contains an academic paper detailing a new method for automatic speech recognition models.


AI-generated summary · Google Gemini · based on 2 sources.

Sources [2]

arXiv cs.CL TIER_1 English(EN) · Jaeyong Lee, Masato Mimura, Takafumi Moriya · 2026-06-24 04:00

Progressive Alignment Objectives for Aligner-Encoder based ASR

arXiv:2606.24147v1 Announce Type: cross Abstract: Aligner-Encoders are recently proposed seq2seq end-to-end ASR models that replace decoder attention by predicting the u-th token directly from the u-th encoder position, so the encoder must learn the alignment internally without cr…
arXiv cs.CL TIER_1 English(EN) · Takafumi Moriya · 2026-06-23 05:09

Progressive Alignment Objectives for Aligner-Encoder based ASR

Aligner-Encoders are recently proposed seq2seq end-to-end ASR models that replace decoder attention by predicting the u-th token directly from the u-th encoder position, so the encoder must learn the alignment internally without cross-attention or a transducer lattice. In practice…
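The position-wise prediction described in the abstract can be illustrated with a toy sketch: token u is scored and read from encoder position u, with no attention or lattice in between. This is not the authors' implementation. The helper names, the plain-list layout of log-probabilities, and the use of an end-of-sequence id to stop reading are all assumptions made for illustration.

```python
import math


def aligner_cross_entropy(log_probs, targets):
    """Mean negative log-likelihood of target u scored at encoder position u.

    log_probs: T lists of V log-probabilities, one list per encoder position.
    targets:   U token ids with U <= T (hypothetical layout).
    """
    if not targets:
        raise ValueError("targets must not be empty")
    if len(targets) > len(log_probs):
        raise ValueError("need at least one encoder position per target token")
    nll = -sum(log_probs[u][tok] for u, tok in enumerate(targets))
    return nll / len(targets)


def greedy_read(log_probs, eos_id):
    """Take the argmax token at positions 0, 1, 2, ... and stop at EOS."""
    tokens = []
    for frame in log_probs:
        tok = max(range(len(frame)), key=frame.__getitem__)
        if tok == eos_id:  # assumed stopping rule, not taken from the source
            break
        tokens.append(tok)
    return tokens


# Three encoder positions over a vocabulary of 3 ids; id 2 plays the role of EOS.
frames = [[math.log(p) for p in row]
          for row in ([0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8])]
print(greedy_read(frames, eos_id=2))
print(aligner_cross_entropy(frames, [0, 1, 2]))
```

Because each target is tied to a fixed position, the loss gives the encoder no explicit alignment signal, which is why the encoder has to learn the alignment internally.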
