PulseAugur
Redesign Mixture-of-Experts Routers with Manifold Power Iteration

New MPI method improves routing effectiveness in MoE models

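Researchers have developed a method called Manifold Power Iteration (MPI) to redesign the router in Mixture-of-Experts (MoE) models. The technique aligns each router row with the principal singular direction of its associated expert, with the aim of improving how tokens are routed to experts. Theoretical analysis shows that MPI drives the router rows toward these principal directions, and empirical tests on MoE models with 1B to 11B parameters indicate that this alignment yields more effective models.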
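To make the idea of a "principal singular direction" concrete, here is a minimal sketch of classical power iteration in NumPy. It is not the paper's MPI procedure, whose details are not given in this summary. It only shows how the top singular direction of a matrix can be estimated and how the alignment of a router row with it could be measured. The function names, the choice of the right singular vector (the input-space direction), and the iteration count are all assumptions made for illustration.

```python
import numpy as np

def principal_right_singular_vector(W, num_iters=200, seed=0):
    """Estimate the top right singular vector of W by power iteration on W^T W.

    Illustrative only: plain power iteration, not the paper's MPI method.
    """
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(W.shape[1])
    v /= np.linalg.norm(v)
    for _ in range(num_iters):
        v = W.T @ (W @ v)          # one multiplication by W^T W
        v /= np.linalg.norm(v)     # renormalize to stay on the unit sphere
    return v

def alignment(router_row, direction):
    """Absolute cosine similarity between a router row and a direction."""
    denom = np.linalg.norm(router_row) * np.linalg.norm(direction)
    return abs(router_row @ direction) / denom
```

An alignment close to 1 would mean the router row points along the expert's dominant input direction (up to sign). The estimate can be checked against `np.linalg.svd(W)`, whose first row of `Vt` holds the same direction.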

Impact: By improving the routing mechanism, this research could lead to more efficient and more effective Mixture-of-Experts models.

Ranking rationale: This cluster contains an academic paper detailing a new method for improving MoE models.


AI-generated summary · Google Gemini · based on 3 sources.

Sources [3]

  1. arXiv cs.AI TIER_1 English(EN) · Songhao Wu, Ang Lv, Ruobing Xie, Yankai Lin

    Redesign Mixture-of-Experts Routers with Manifold Power Iteration

    arXiv:2606.12397v1. Abstract: Router is the cornerstone component to the Mixture-of-Experts models. Serving as expert proxies, the rows of the router matrix compute their similarity to the MoE inputs to determine which subset of experts is activated. Ideally, …

  2. arXiv cs.AI TIER_1 English(EN) · Yankai Lin

    Redesign Mixture-of-Experts Routers with Manifold Power Iteration

    Router is the cornerstone component to the Mixture-of-Experts models. Serving as expert proxies, the rows of the router matrix compute their similarity to the MoE inputs to determine which subset of experts is activated. Ideally, each router row is designed to encode the expert m…

  3. Hugging Face Daily Papers TIER_1 English(EN)

    Redesign Mixture-of-Experts Routers with Manifold Power Iteration

    Researchers propose a novel router redesign for Mixture-of-Experts models that aligns router rows with the principal singular directions of expert matrices using Manifold Power Iteration to improve model effectiveness.
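
The abstract excerpts above describe router rows as expert proxies whose similarity to the input decides which experts are activated. A minimal sketch of that standard top-k routing step follows. The function name, the dot-product similarity, and the softmax over the selected scores are common conventions assumed here for illustration. They are not taken from the paper.

```python
import numpy as np

def route_top_k(x, router, k=2):
    """Pick the k experts whose router rows score highest against input x.

    router has one row per expert. Gate weights are a softmax over the
    selected scores (an assumed, commonly used convention).
    """
    logits = router @ x                      # one similarity score per expert
    top = np.argsort(logits)[-k:][::-1]      # indices of the k largest scores
    selected = logits[top]
    weights = np.exp(selected - selected.max())  # numerically stable softmax
    weights /= weights.sum()
    return top, weights
```

Because each row acts as a stand-in for its expert, a row that poorly reflects what the expert actually responds to can send tokens to the wrong expert, which is the motivation the summary gives for aligning rows with expert directions.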