PulseAugur
SARA: Unlocking Multilingual Knowledge in Mixture-of-Experts via Semantically Anchored Routing Alignment

SARA framework improves multilingual MoE performance through routing alignment

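Researchers have developed SARA, a new framework aimed at improving the performance of Mixture-of-Experts (MoE) models on low-resource languages. SARA targets a specific failure mode: tokens from low-resource languages are often routed to different experts than tokens from high-resource languages, which hinders cross-lingual knowledge transfer. By applying a Jensen-Shannon divergence constraint, SARA aligns the internal routing distributions of the MoE layers, so that the specialized capabilities learned for high-resource languages carry over to low-resource ones. Experiments show that SARA improves results on benchmarks such as Global-MMLU for models including Qwen3-30B-A3B and Phi-3.5-MoE-instruct.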
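To make the routing mismatch and the Jensen-Shannon measure concrete, here is a minimal sketch in plain Python. It is not SARA's implementation: the summary does not say how the constraint is applied (per layer, per token, or over which sentence pairs), so the router logits, the four-expert setup, and all function names below are hypothetical and chosen only for illustration.

```python
import math


def softmax(logits):
    """Turn raw router logits into a probability distribution over experts."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) in nats; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, bounded by ln 2."""
    m = [(pi + qi) / 2.0 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def top_k_experts(probs, k):
    """Indices of the k experts with the highest routing probability."""
    return sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]


# Hypothetical router logits for the same content in two languages
# (made-up numbers, 4 experts, not taken from the paper).
high_resource_logits = [2.0, 0.5, -1.0, 0.0]
low_resource_logits = [-1.0, 0.0, 2.0, 0.5]

p_high = softmax(high_resource_logits)
p_low = softmax(low_resource_logits)

# With top-2 routing the two inputs reach disjoint sets of experts.
experts_high = top_k_experts(p_high, 2)
experts_low = top_k_experts(p_low, 2)

# The divergence an alignment penalty would push toward zero.
routing_gap = js_divergence(p_high, p_low)
identical_gap = js_divergence(p_high, p_high)

print("top-2 experts (high-resource):", experts_high)
print("top-2 experts (low-resource): ", experts_low)
print("JS divergence between routings:", routing_gap)
```

In a training setup, a term like `routing_gap` would typically be added to the task loss with a weighting coefficient, so that gradient updates pull the low-resource routing distribution toward the high-resource one. Jensen-Shannon divergence is a natural fit for such a penalty because it is symmetric and stays finite even when one distribution assigns near-zero probability to an expert.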

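Impact: Strengthens the multilingual capabilities of sparse AI architectures, potentially improving accessibility and performance for low-resource languages.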

Ranking rationale: An academic paper detailing a new framework for improving MoE models.

Read on arXiv cs.AI →

AI-generated summary · Google Gemini · based on 2 sources.

Sources [2]

  1. arXiv cs.CL · Tianyu Dong, Yangyang Liu, Jiang Zhou, Xinwei Wu, Xiaohu Zhao, Hao Wang, Heng Liu, Linlong Xu, Longyue Wang, Weihua Luo, Shaolin Zhu, Deyi Xiong

    SARA: Unlocking Multilingual Knowledge in Mixture-of-Experts via Semantically Anchored Routing Alignment

    arXiv:2606.25821v1. Abstract: Sparse Mixture-of-Experts (MoE) architectures have emerged as an increasingly influential paradigm as they offer a strategic balance between parameter scalability and computational efficiency. However, low-resource languages, which …

  2. arXiv cs.AI · Deyi Xiong

    SARA: Unlocking Multilingual Knowledge in Mixture-of-Experts via Semantically Anchored Routing Alignment

    Sparse Mixture-of-Experts (MoE) architectures have emerged as an increasingly influential paradigm as they offer a strategic balance between parameter scalability and computational efficiency. However, low-resource languages, which suffer from a scarcity of high-quality training …