Subtitle-Aligned Fine-Tuning of Whisper for Swiss German ASR: Benchmark Contamination, Convention Mismatch, and an Honest Baseline at 25.6% WER (13.8% cWER)

Whisper fine-tuning improves Swiss German ASR and exposes benchmark flaws

By the PulseAugur editorial team · [1 source] · 2026-06-09 04:00

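Researchers have developed a method for fine-tuning OpenAI's Whisper large-v3 model to improve automatic speech recognition (ASR) for Swiss German. The approach uses 1,367 hours of broadcast speech paired with Standard German subtitles as weak supervision and reaches a word error rate (WER) of 25.6% on a strictly disjoint test set. A reconciled error analysis puts the content WER (cWER) at 13.8%, suggesting that the actual error rate is substantially lower than the measured WER. The study also finds that existing state-of-the-art results for Swiss German ASR are inflated by benchmark contamination: a vanilla Whisper model achieves a lower WER without any dedicated Swiss German training.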
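For readers unfamiliar with the metric, the following is a minimal sketch of how standard WER is usually computed: the word-level edit distance between a reference transcript and a system output, divided by the number of reference words. The function name `word_error_rate`, the lowercase-and-split normalization, and the example sentences are illustrative assumptions, not the paper's evaluation pipeline. The summary does not say how content WER (cWER) is derived, so only plain WER is shown.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Normalization here (lowercasing, whitespace split) is an assumption
    made for illustration; real evaluations often normalize differently.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One dropped word out of five reference words.
print(word_error_rate("das ist ein kleiner test", "das ist ein test"))
```

Because every word mismatch counts as a full error, differences in transcription convention between reference and output can raise WER even when the content is right, which is one reason a separate content-level figure can be lower.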

Impact: Highlights the potential for improving ASR in low-resource languages and the need for rigorous benchmark evaluation.

Ranking rationale: An academic paper detailing a new ASR method and a benchmark analysis.


AI-generated summary · Google Gemini · based on 1 source.

Sources [1]

arXiv cs.AI · English · Felix Akeret · 2026-06-09 04:00

Subtitle-Aligned Fine-Tuning of Whisper for Swiss German ASR: Benchmark Contamination, Convention Mismatch, and an Honest Baseline at 25.6% WER (13.8% cWER)

arXiv:2606.07608v1. Abstract: We present a systematic study of fine-tuning OpenAI's Whisper large-v3 for Swiss German ASR, using 1,367 hours of broadcast speech paired with Standard German subtitles as weak supervision. Through 16 iterative training runs on an…

