What Counts as an Error? Dual-Reference Benchmarking for Atypical ASR

New ASR benchmarking method highlights discrepancies in atypical speech recognition

By PulseAugur Editorial · [2 sources] · 2026-06-30 04:15

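A newly published arXiv paper introduces a dual-reference benchmarking method for automatic speech recognition (ASR) systems, aimed specifically at the challenges of atypical speech. The study points out that most ASR evaluations conflate two kinds of transcription reference, verbatim and intended, which can misrepresent model performance. Benchmarking 11 ASR models on stuttered speech against both verbatim and intended references, the authors show that model rankings differ substantially depending on which reference style is chosen. The result underscores how much accurate model evaluation depends on choosing the appropriate transcription reference, especially in use cases involving atypical speech.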
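To make the ranking effect concrete, here is a minimal sketch of scoring the same outputs against both reference styles. It assumes word error rate (WER) as the metric, which the summary above does not state explicitly, and it is not the paper's evaluation code. The utterance, the two model outputs, and the names `model_a` and `model_b` are all hypothetical.

```python
def word_edit_distance(ref, hyp):
    """Levenshtein distance between two lists of words."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance over reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return word_edit_distance(ref, hyp) / len(ref)


# Hypothetical stuttered utterance with its two valid references.
verbatim = "i i i want to to go home"  # what was actually produced
intended = "i want to go home"         # canonical form of the text

# Hypothetical outputs from two imaginary systems.
outputs = {
    "model_a": "i i i want to to go home",  # keeps the repetitions
    "model_b": "i want to go home",         # emits the fluent form
}

scores = {
    name: {"verbatim": wer(verbatim, hyp), "intended": wer(intended, hyp)}
    for name, hyp in outputs.items()
}


def rank(reference_style):
    """Model names ordered from lowest to highest WER for one reference style."""
    return sorted(scores, key=lambda name: scores[name][reference_style])


for style in ("verbatim", "intended"):
    print(style, rank(style), {name: round(s[style], 3) for name, s in scores.items()})
```

In this toy case `model_a` is perfect against the verbatim reference but scores 0.6 WER against the intended one, while `model_b` is perfect against the intended reference and scores 0.375 against the verbatim one, so the ranking flips with the reference style.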

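Impact: Underscores the need for nuanced evaluation of ASR systems, and may influence future development and benchmarking standards for atypical speech.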

Ranking rationale: A research paper published on arXiv that introduces a new benchmarking method for ASR systems.


AI-generated summary · Google Gemini · from 2 sources.

Coverage sources [2]

arXiv cs.CL TIER_1 English(EN) · Hawau Olamide Toyin, Srinivasan Umesh, Hanan Aldarmaki · 2026-07-01 04:00

What Counts as an Error? Dual-Reference Benchmarking for Atypical ASR

arXiv:2606.31112v1 Announce Type: new Abstract: ASR systems have been often reported to underperform on atypical speech. An often conflated compounding factor is the existence of two valid transcription references: verbatim (actual produced speech, including repetitions/prolongat…

arXiv cs.CL TIER_1 English(EN) · Hanan Aldarmaki · 2026-06-30 04:15

What Counts as an Error? Dual-Reference Benchmarking for Atypical ASR

ASR systems have been often reported to underperform on atypical speech. An often conflated compounding factor is the existence of two valid transcription references: verbatim (actual produced speech, including repetitions/prolongations) and intended (the canonical form of the te…
