Enhancing Multilingual LLM-based ASR with Mixture of Experts and Dynamic Downsampling

New LLM-ASR framework improves multilingual speech recognition

By the PulseAugur editorial team · [2 sources] · 2026-06-09 05:35

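Researchers have developed a new multilingual automatic speech recognition (ASR) framework built on large language models (LLMs). The system uses a Mixture of Experts (MoE) architecture to improve cross-lingual performance, and a Continuous Integrate-and-Fire (CIF) mechanism for dynamic downsampling and modality alignment. The approach aims to make LLM-based ASR systems more accurate and robust, and is reported to improve significantly over existing models.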
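To make the dynamic downsampling idea concrete, here is a minimal sketch of the integrate-and-fire step as CIF is commonly described, not the paper's own implementation. It assumes each speech frame arrives with a weight in [0, 1] (in a real system these would come from a learned predictor); the function name `cif_downsample` and the firing threshold of 1.0 are illustrative assumptions.

```python
def cif_downsample(frames, weights, threshold=1.0):
    """Illustrative Continuous Integrate-and-Fire (CIF) downsampling.

    frames:  list of feature vectors (each a list of floats), one per frame.
    weights: list of per-frame weights in [0, 1]; assumed to be given here,
             whereas a real model would predict them.
    Returns a shorter list of vectors, one per "fire" event.
    """
    if not frames:
        return []
    dim = len(frames[0])
    outputs = []
    acc_weight = 0.0           # weight integrated since the last fire
    acc_state = [0.0] * dim    # weighted sum of frames since the last fire

    for frame, alpha in zip(frames, weights):
        if acc_weight + alpha < threshold:
            # Keep integrating: the boundary has not been reached yet.
            acc_weight += alpha
            acc_state = [s + alpha * x for s, x in zip(acc_state, frame)]
        else:
            # Fire: spend just enough of this frame's weight to reach the
            # threshold, emit the integrated vector, and carry the rest over.
            used = threshold - acc_weight
            outputs.append([s + used * x for s, x in zip(acc_state, frame)])
            remainder = alpha - used
            acc_weight = remainder
            acc_state = [remainder * x for x in frame]

    # Any leftover weight below the threshold is dropped in this sketch.
    return outputs
```

Because the number of outputs depends on how much weight the frames carry rather than on a fixed stride, the output length adapts to the content: for example, five frames with weights 0.5, 0.5, 0.25, 0.75, 0.5 produce two output vectors.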

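Impact: Introduces new techniques for improving multilingual ASR performance with LLMs, which could make speech technology more accessible worldwide.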

Ranking rationale: This cluster contains an academic paper detailing a new LLM-ASR technique.

AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

arXiv cs.CL TIER_1 English(EN) · Guodong Lin, Ziqi Chen, Yuxiang Fu, Ke Li, Wei-Qiang Zhang · 2026-06-10 04:00

Enhancing Multilingual LLM-based ASR with Mixture of Experts and Dynamic Downsampling

arXiv:2606.10439v1 Announce Type: cross Abstract: The rapid progress of large language models (LLMs) has opened up a new frontier for automatic speech recognition (ASR), making their effective integration a critical and challenging research direction. To this end, this work propo…

arXiv cs.CL TIER_1 English(EN) · Wei-Qiang Zhang · 2026-06-09 05:35

Enhancing Multilingual LLM-based ASR with Mixture of Experts and Dynamic Downsampling

The rapid progress of large language models (LLMs) has opened up a new frontier for automatic speech recognition (ASR), making their effective integration a critical and challenging research direction. To this end, this work proposes a projector-based LLM-ASR framework targeting …