PulseAugur

New ASR Benchmarks and Training Methods Emerge for the LLM Era

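Researchers are developing new methods to improve automatic speech recognition (ASR) systems, particularly in specialized domains. One approach uses synthetic speech to train ASR models for regulated industries such as banking and healthcare, addressing privacy concerns by reducing reliance on real, sensitive recordings. A second effort introduces Preference-ASR, a new test set designed to evaluate how well ASR systems follow user-defined output styles for numbers, disfluencies, entities, and casing, revealing performance differences that traditional benchmarks fail to capture.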

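Impact: Advances in ASR training and evaluation could lead to more accurate and more customizable speech recognition systems across a range of applications.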

Ranking rationale: Two academic papers introduce new methods and datasets for ASR systems.


AI-generated summary · Google Gemini · from 2 sources.


Sources [2]

  1. arXiv cs.AI · Yanis Labrak, Dairazalia Sanchez-Cortes, Sergio Burdisso, Séverin Baroudi, Shashi Kumar, Esaú Villatoro-Tello, Srikanth Madikeri, Manjunath K E, Oldřich Plchot, Kadri Hacioğlu, Petr Motlicek, Andreas Stolcke

    How to Leverage Synthetic Speech for LLM-Based ASR Systems?

    arXiv:2606.29031v1 Announce Type: cross Abstract: In regulated domains such as banking and healthcare, where privacy constraints make real speech costly to collect and retain, synthetic speech from modern text-to-speech (TTS) is an appealing alternative for training automatic spe…

  2. arXiv cs.CL · Nithin Rao Koluguri, Sasha Meister, Nikolay Karpov, Piotr Zelasko, Desh Raj, Jagadeesh Balam, Boris Ginsburg

    Preference-ASR: A Preference-Aware Test Set for Benchmarking ASR in the Era of Speech LLMs

    arXiv:2606.29534v1 Announce Type: new Abstract: Popular ASR test sets adopt inconsistent conventions for numbers, disfluencies, entities, and casing, while standard normalizers erase the format distinctions users care about. Current benchmarks therefore cannot measure whether a m…
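
The Preference-ASR abstract notes that standard normalizers erase the format distinctions users care about. The Python sketch below illustrates that general point and is not the paper's method or metric. The `toy_normalize` function is a hypothetical stand-in for a typical normalizer (lowercasing, punctuation stripping, filler removal), and the example sentences are invented for illustration. It compares word error rate (WER) on raw and on normalized text.

```python
import re


def wer(ref_words, hyp_words):
    """Word error rate: word-level Levenshtein distance / reference length."""
    prev = list(range(len(hyp_words) + 1))
    for i, r in enumerate(ref_words, start=1):
        cur = [i]
        for j, h in enumerate(hyp_words, start=1):
            cost = 0 if r == h else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # substitution or match
        prev = cur
    return prev[-1] / max(len(ref_words), 1)


def toy_normalize(text):
    """Hypothetical normalizer: lowercase, strip punctuation, drop fillers."""
    text = re.sub(r"[^\w\s]", "", text.lower())
    fillers = {"uh", "um"}  # assumed filler list, for illustration only
    return [w for w in text.split() if w not in fillers]


# Invented example: the user wants casing, punctuation, and disfluencies kept.
reference = "Um, I paid Acme Bank today."
hypothesis = "i paid acme bank today"

raw_wer = wer(reference.split(), hypothesis.split())
norm_wer = wer(toy_normalize(reference), toy_normalize(hypothesis))

print(f"WER on raw text:        {raw_wer:.2f}")
print(f"WER after normalizing:  {norm_wer:.2f}")
```

In this toy case the normalized score is zero even though the hypothesis ignores the requested casing, punctuation, and disfluency style, while the raw score penalizes every formatting mismatch. A preference-aware test set is aimed at the gap between those two numbers.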