Dense2MoE: Pushing the Pareto Frontier of On-Device LLMs via Unified Pruning and Upcycling

New Dense2MoE framework optimizes on-device large language models

By the PulseAugur editorial team · [1 source] · 2026-05-27 04:00

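Researchers have developed Dense2MoE, a new framework that unifies pruning and upcycling to build efficient on-device large language models (LLMs). The approach targets two problems: the high cost of training Mixture-of-Experts (MoE) models from scratch and the inefficiency of existing upcycling methods. By pruning bandwidth-intensive attention modules and repurposing MLPs as MoE experts, Dense2MoE aims to improve both inference efficiency and accuracy on resource-constrained devices.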
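The summary does not describe the paper's actual procedure, so the following is only a minimal sketch of the general idea of reusing a dense MLP as MoE experts, not the Dense2MoE algorithm. It assumes a two-layer ReLU MLP whose hidden neurons are split evenly into experts, an unweighted sum of the selected experts' outputs, and a linear top-k router. The function names (`split_into_experts`, `moe_forward`), the router matrix, and all sizes are hypothetical, and attention pruning is not shown.

```python
import random

def relu(v):
    return [max(0.0, a) for a in v]

def matvec(m, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * a for w, a in zip(row, v)) for row in m]

def dense_mlp(w_in, w_out, x):
    # w_in has shape hidden x d_model, w_out has shape d_model x hidden.
    return matvec(w_out, relu(matvec(w_in, x)))

def split_into_experts(w_in, w_out, num_experts):
    # Assumption: experts are contiguous, equal-sized slices of hidden neurons.
    hidden = len(w_in)
    assert hidden % num_experts == 0, "hidden size must divide evenly"
    size = hidden // num_experts
    experts = []
    for e in range(num_experts):
        lo, hi = e * size, (e + 1) * size
        experts.append((w_in[lo:hi], [row[lo:hi] for row in w_out]))
    return experts

def moe_forward(experts, router, x, top_k):
    # Hypothetical linear router: one score per expert, keep the top_k.
    scores = matvec(router, x)
    order = sorted(range(len(experts)), key=lambda e: scores[e], reverse=True)
    chosen = order[:top_k]
    out = [0.0] * len(experts[0][1])
    for e in chosen:
        e_in, e_out = experts[e]
        y = matvec(e_out, relu(matvec(e_in, x)))
        out = [a + b for a, b in zip(out, y)]  # unweighted sum of expert outputs
    return out, chosen

# Toy usage with random weights (illustrative sizes only).
random.seed(0)
d_model, hidden, num_experts = 4, 8, 4
w_in = [[random.uniform(-1, 1) for _ in range(d_model)] for _ in range(hidden)]
w_out = [[random.uniform(-1, 1) for _ in range(hidden)] for _ in range(d_model)]
router = [[random.uniform(-1, 1) for _ in range(d_model)] for _ in range(num_experts)]
x = [random.uniform(-1, 1) for _ in range(d_model)]

experts = split_into_experts(w_in, w_out, num_experts)
dense_out = dense_mlp(w_in, w_out, x)
full_out, _ = moe_forward(experts, router, x, top_k=num_experts)
sparse_out, chosen = moe_forward(experts, router, x, top_k=1)
```

With every expert active, the summed expert outputs match the dense MLP up to floating-point rounding, because the hidden neurons are only partitioned, not changed. With `top_k=1`, only a quarter of the MLP weights are read per token, which is the kind of saving that matters on bandwidth-limited devices.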

Impact: This research could lead to more capable and more efficient on-device LLM applications, improving user experience and accessibility.

Ranking rationale: This is an academic paper detailing a new method for building efficient on-device LLMs.

AI-generated summary · Google Gemini · based on 1 source.

Sources [1]

arXiv cs.AI TIER_1 · Fengfa Li, Hongjin Ji, Yifeng Ding, Lei Ren, Chen Wei · 2026-05-27 04:00

Dense2MoE: Pushing the Pareto Frontier of On-Device LLMs via Unified Pruning and Upcycling

arXiv:2605.26496v1 Announce Type: cross Abstract: The Mixture of Experts (MoE) architecture is highly promising for resource-constrained on-device deployments, yet training these models from scratch incurs prohibitive costs. Current methods attempt to alleviate this by upcycling dens…