New method efficiently expands LLMs to more languages via MoE architecture

By the PulseAugur editorial team · [1 source] · 2026-05-18 08:59

Researchers have developed a method for extending Large Language Models (LLMs) to additional languages without extensive retraining. The technique converts a dense model into a Mixture-of-Experts (MoE) architecture in which different experts handle different languages. New language capabilities are integrated through post-training parameter deltas, which bypasses the data-intensive alignment phase and preserves the model's original abilities.
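The idea of a post-training parameter delta can be illustrated with a toy sketch. This is not the paper's implementation: the flat weight lists, the numeric values, the function names (`param_delta`, `apply_delta`, `upcycle`, `route`), and the routing by explicit language tag are all assumptions made for illustration. A real MoE layer would use a learned router over tensors rather than a dictionary lookup.

```python
# Toy sketch only: weights are flat lists of floats and every name is hypothetical.

def param_delta(post_trained, base):
    """Element-wise difference between post-trained and base weights."""
    return [p - b for p, b in zip(post_trained, base)]

def apply_delta(weights, delta):
    """Add a parameter delta to a set of weights."""
    return [w + d for w, d in zip(weights, delta)]

def upcycle(dense_ffn, languages):
    """Start each language expert as a copy of the dense feed-forward weights."""
    return {lang: list(dense_ffn) for lang in languages}

# Made-up values standing in for real checkpoints.
base = [0.5, -1.0, 2.0]            # pretrained dense weights
post_trained = [0.75, -1.0, 2.5]   # same model after post-training (alignment)
cpt_new_lang = [0.5, -0.5, 2.25]   # base after continued pre-training on a new language

# The delta captures what post-training changed, independent of language data.
delta = param_delta(post_trained, base)

# Upcycle the post-trained dense weights into one expert per language,
# then replace the new-language expert with CPT weights plus the delta.
experts = upcycle(post_trained, ["en", "new"])
experts["new"] = apply_delta(cpt_new_lang, delta)

def route(lang):
    """Assumed routing: pick the expert by language tag, defaulting to 'en'."""
    return experts.get(lang, experts["en"])

print(route("new"))
```

In this sketch the original expert is left untouched, which mirrors the claim that the base model's abilities are preserved, while the new-language expert receives the post-training behaviour without a separate alignment run.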

Impact: This method could significantly reduce the cost and complexity of making LLMs multilingual, potentially accelerating global access to advanced AI capabilities.

Ranking rationale: The cluster contains an academic paper detailing a new method for LLM language expansion.


LLMs

AI-generated summary · Google Gemini · based on 1 source.

Sources [1]

arXiv cs.CL TIER_1 English(EN) · Shujian Huang · 2026-05-18 08:59

A Data-Efficient Path to Multilingual LLMs: Language Expansion via Post-training PARAM$Δ$ Integration into Upcycled MoE

Expanding Large Language Models (LLMs) to new languages is a costly endeavor, demanding extensive Continued Pre-Training (CPT) and data-intensive alignment. While recent data-free merging techniques attempt to bypass alignment by fusing a multilingual CPT-enhanced model with its …

