PulseAugur
Rethinking Sparse Mixture of Experts from a Unified Perspective

New Frameworks Enhance Sparse Mixture of Experts Models

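Two new research papers propose novel frameworks for optimizing Sparse Mixture of Experts (SMoE) models. The first, Unified Sparse Mixture of Experts (USMoE), reformulates SMoE as a linear programming problem to derive a unified mechanism and a unified score, improving performance across a range of tasks and data types. The second, Nash Merging of Experts (NAMEx), applies game theory, specifically Nash bargaining, to expert merging in order to strengthen collaboration among experts and improve efficiency. NAMEx has been shown to be effective on large-scale models such as Qwen1.5-MoE and DeepSeek-MoE.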
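For readers unfamiliar with SMoE routing, the following is a minimal sketch of Token Choice routing, the family that the USMoE abstract in the sources describes as routing each token to a fixed number of experts. It is an illustration only, not the USMoE method: the function names, the softmax gate, the renormalization over the selected experts, and the example logits are all assumptions made for this sketch.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def token_choice_route(router_logits, k):
    """For each token, select its k highest-scoring experts.

    router_logits: one list per token, holding one logit per expert.
    Returns, per token, a list of (expert_index, gate_weight) pairs.
    Gate weights are renormalized over the selected experts
    (an assumption of this sketch; conventions vary between models).
    """
    routes = []
    for logits in router_logits:
        probs = softmax(logits)
        # Indices of the k largest probabilities; ties go to the lower index.
        top = sorted(range(len(probs)), key=lambda i: (-probs[i], i))[:k]
        norm = sum(probs[i] for i in top)
        routes.append([(i, probs[i] / norm) for i in top])
    return routes


# Hypothetical router logits: 3 tokens, 4 experts.
logits = [
    [2.0, 0.5, 0.1, 1.0],
    [0.0, 0.0, 3.0, 0.0],
    [1.0, 1.0, 1.0, 1.0],
]
routes = token_choice_route(logits, k=2)
for token_index, route in enumerate(routes):
    print(token_index, [(i, round(w, 3)) for i, w in route])
```

Because every token activates exactly `k` experts, the per-token compute stays constant as the total number of experts grows, which is the property the abstract attributes to SMoE models.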

Impact: These advances in SMoE architecture could lead to more efficient and more capable AI models across domains.

Ranking rationale: Two academic papers propose novel methods for improving existing model architectures.


AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

  1. arXiv cs.CL TIER_1 · Giang Do, Hung Le, Truyen Tran

    Rethinking Sparse Mixture of Experts from a Unified Perspective

    arXiv:2503.22996v3 Announce Type: replace Abstract: Sparse Mixture of Experts (SMoE) models scale the capacity of models while maintaining constant computational overhead. SMoE methods fall into two categories: Token Choice, which routes each token to a fixed number of experts, a…

  2. arXiv stat.ML TIER_1 · Dung V. Nguyen, Anh T. Nguyen, Minh H. Nguyen, Luc Q. Nguyen, Shiqi Jiang, Ethan Fetaya, Linh Duy Tran, Gal Chechik, Tan M. Nguyen

    Expert Merging in Sparse Mixture of Experts with Nash Bargaining

    arXiv:2510.16138v2 Announce Type: replace-cross Abstract: Existing expert merging strategies for Sparse Mixture of Experts (SMoE) typically rely on input-dependent or input-independent averaging of expert parameters, but often lack a principled weighting mechanism. In this work, …
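As a point of reference for the second abstract, the sketch below shows input-independent averaging of expert parameters, the kind of baseline merging strategy the abstract says existing methods rely on. It is not the Nash bargaining procedure of NAMEx, whose details are not given in the text above. The function name `merge_experts`, the list-of-rows matrix layout, and the example values are assumptions made for illustration.

```python
def merge_experts(expert_weights, coeffs):
    """Merge experts by a weighted average of their parameter matrices.

    expert_weights: list of matrices (lists of rows), all the same shape.
    coeffs: one non-negative coefficient per expert; normalized here
            so that the merged matrix is a convex combination.
    """
    if len(expert_weights) != len(coeffs):
        raise ValueError("need exactly one coefficient per expert")
    total = sum(coeffs)
    if total <= 0:
        raise ValueError("coefficients must sum to a positive value")
    norm = [c / total for c in coeffs]

    rows = len(expert_weights[0])
    cols = len(expert_weights[0][0])
    merged = [[0.0] * cols for _ in range(rows)]
    for weights, c in zip(expert_weights, norm):
        for r in range(rows):
            for j in range(cols):
                merged[r][j] += c * weights[r][j]
    return merged


# Two hypothetical 2x2 expert weight matrices, merged with equal weight.
expert_a = [[1.0, 2.0], [3.0, 4.0]]
expert_b = [[3.0, 4.0], [5.0, 6.0]]
merged = merge_experts([expert_a, expert_b], coeffs=[1.0, 1.0])
print(merged)
```

The coefficients here are fixed by hand. According to the abstract, the lack of a principled way to choose such weights is the gap the paper sets out to address.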