PulseAugur
English(EN) Enhancing Multilingual LLM-based ASR with Mixture of Experts and Dynamic Downsampling

New LLM-ASR framework improves multilingual speech recognition

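Researchers have developed a new framework for multilingual automatic speech recognition (ASR) built on large language models (LLMs). The system uses a Mixture of Experts (MoE) architecture to improve cross-lingual performance, and a Continuous Integrate-and-Fire (CIF) mechanism for dynamic downsampling and modality alignment between speech and text. The approach aims to produce more accurate and more robust LLM-based ASR systems, and is reported to improve significantly on existing models.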
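To make the dynamic downsampling idea concrete, here is a minimal sketch of a generic Continuous Integrate-and-Fire step in plain Python. It is not the paper's implementation: the function name `cif_downsample`, the firing threshold of 1.0, and the hand-picked weights are assumptions for illustration. In a real system the per-frame weights would be predicted by a learned module, and each weight is assumed here to lie between 0 and the threshold so that a frame fires at most once.

```python
from typing import List, Sequence


def cif_downsample(
    frames: Sequence[Sequence[float]],
    weights: Sequence[float],
    threshold: float = 1.0,
) -> List[List[float]]:
    """Integrate weighted frames and emit one vector each time the
    accumulated weight reaches `threshold` (hypothetical helper)."""
    if len(frames) != len(weights):
        raise ValueError("frames and weights must have the same length")
    if not frames:
        return []

    dim = len(frames[0])
    outputs: List[List[float]] = []
    acc_weight = 0.0          # weight accumulated since the last firing
    acc_vec = [0.0] * dim     # weighted sum of frames since the last firing

    for frame, alpha in zip(frames, weights):
        if acc_weight + alpha < threshold:
            # Integrate: threshold not reached, keep accumulating.
            acc_weight += alpha
            acc_vec = [a + alpha * x for a, x in zip(acc_vec, frame)]
        else:
            # Fire: spend only the part of alpha needed to reach the threshold.
            used = threshold - acc_weight
            outputs.append([a + used * x for a, x in zip(acc_vec, frame)])
            # The remainder of this frame's weight starts the next unit.
            rest = alpha - used
            acc_weight = rest
            acc_vec = [rest * x for x in frame]

    # Leftover weight below the threshold is dropped in this sketch.
    return outputs


if __name__ == "__main__":
    speech_frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
    frame_weights = [0.5, 0.5, 0.75, 0.25]
    # Four input frames are compressed into two output vectors.
    print(cif_downsample(speech_frames, frame_weights))
```

The output length depends on the sum of the weights rather than on a fixed stride, which is what makes the downsampling dynamic: regions with low weights are compressed more heavily than regions with high weights.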

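Impact: Introduces new techniques for improving multilingual ASR performance with LLMs, which may make speech technology more accessible worldwide.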

Ranking rationale: This cluster contains an academic paper detailing a new LLM-ASR technique.


AI-generated summary · Google Gemini · based on 2 sources.

Sources [2]

  1. arXiv cs.CL TIER_1 English(EN) · Guodong Lin, Ziqi Chen, Yuxiang Fu, Ke Li, Wei-Qiang Zhang

    Enhancing Multilingual LLM-based ASR with Mixture of Experts and Dynamic Downsampling

    arXiv:2606.10439v1 Announce Type: cross Abstract: The rapid progress of large language models (LLMs) has opened up a new frontier for automatic speech recognition (ASR), making their effective integration a critical and challenging research direction. To this end, this work propo…

  2. arXiv cs.CL TIER_1 English(EN) · Wei-Qiang Zhang

    Enhancing Multilingual LLM-based ASR with Mixture of Experts and Dynamic Downsampling

    The rapid progress of large language models (LLMs) has opened up a new frontier for automatic speech recognition (ASR), making their effective integration a critical and challenging research direction. To this end, this work proposes a projector-based LLM-ASR framework targeting …