PulseAugur

New method prunes MoE language models using generic text corpora

Researchers have proposed Generic TB-Coverage, a method for pruning sparsely activated Mixture-of-Experts (MoE) language models. It addresses the problem of removing redundant experts when no downstream calibration data is available. Using generic text corpora such as WikiText2 and C4, Generic TB-Coverage profiles per-expert utility separately on each corpus and ensures that the high-utility experts from each corpus are retained. The approach is reported to improve average accuracy and reduce perplexity degradation on models such as Qwen1.5-MoE-A2.7B and DeepSeek-MoE-16B-Base, particularly under aggressive pruning.
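The retention rule described above can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the summary does not say how utility is measured or how the retained set is assembled, so the function `select_experts`, the `per_corpus_quota` parameter, the fill-by-mean-normalized-utility step, and all utility values below are assumptions made for illustration only.

```python
import numpy as np

def select_experts(utilities, keep, per_corpus_quota):
    """Pick expert indices to retain in one MoE layer (hypothetical helper).

    utilities: dict mapping corpus name -> per-expert utility scores.
    keep: total number of experts to retain.
    per_corpus_quota: top experts guaranteed a slot from each corpus.
    """
    # Rows are corpora, columns are experts.
    scores = np.stack([np.asarray(u, dtype=float) for u in utilities.values()])
    num_experts = scores.shape[1]
    if not 0 < keep <= num_experts:
        raise ValueError("keep must be between 1 and the number of experts")

    # Step 1 (assumed): guarantee coverage of each corpus's top experts.
    retained = set()
    for row in scores:
        top = np.argsort(-row, kind="stable")[:per_corpus_quota]
        retained.update(int(i) for i in top)
    if len(retained) > keep:
        raise ValueError("per_corpus_quota is too large for the keep budget")

    # Step 2 (assumed): fill the remaining budget by mean normalized utility.
    totals = scores.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0  # avoid dividing by zero for an all-zero corpus
    mean_utility = (scores / totals).mean(axis=0)
    for i in np.argsort(-mean_utility, kind="stable"):
        if len(retained) >= keep:
            break
        retained.add(int(i))
    return sorted(retained)

# Made-up utility scores for six experts; not taken from the paper.
example_utilities = {
    "wikitext2": [0.90, 0.10, 0.05, 0.60, 0.02, 0.30],
    "c4":        [0.05, 0.80, 0.10, 0.02, 0.70, 0.30],
}
print(select_experts(example_utilities, keep=4, per_corpus_quota=2))
```

In this toy example, experts 0 and 3 matter mostly on one corpus and experts 1 and 4 mostly on the other. Profiling each corpus separately keeps all four, whereas ranking on a single corpus alone would drop the experts that only the other corpus exercises.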

IMPACT This method could enable more efficient deployment of large MoE models by reducing their size without significant performance loss.

RANK_REASON The cluster contains a research paper detailing a new method for pruning language models.

Read on arXiv cs.AI →

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. arXiv cs.AI TIER_1 English(EN) · Yongqin Zeng, Sicheng Pan, Jiale Wang, Hai-tao Zheng, Hong-Gee Kim, Chunxia Ma, XiuTeng Zhou

    Generic Expert Coverage for Pruning Sparse Mixture-of-Experts Language Models

    arXiv:2607.01710v1 Announce Type: new Abstract: Sparsely activated Mixture-of-Experts (MoE) language models contain substantial structured redundancy among routed experts, but pruning them without downstream calibration data remains challenging. Existing expert-pruning methods ty…