MobileMoE models set new efficiency standard for on-device LLMs

By PulseAugur Editorial · [4 sources] · 2026-05-26 00:00

Researchers have introduced MobileMoE, a family of Mixture-of-Experts (MoE) language models designed for on-device mobile deployment. With sub-billion active parameters, the models are reported to establish a new performance frontier for on-device LLMs by tailoring the MoE architecture to mobile memory and compute constraints. Across 14 benchmarks, MobileMoE models match or outperform leading dense LLMs and existing MoE models while using significantly fewer FLOPs and parameters. The project also reports the first efficient MoE inference on smartphones, with substantial speedups in both prefill and decode.

IMPACT Establishes a new Pareto frontier for on-device LLMs, potentially accelerating the deployment of advanced AI capabilities on mobile devices.

RANK_REASON The cluster contains an academic paper detailing a new model architecture and its performance benchmarks.

AI-generated summary · Google Gemini · from 4 sources.

COVERAGE [4]

arXiv cs.AI TIER_1 English(EN) · Yanbei Chen, Hanxian Huang, Ernie Chang, Jacob Szwejbka, Digant Desai, Zechun Liu, Vikas Chandra, Raghuraman Krishnamoorthi · 2026-05-27 04:00

MobileMoE: Scaling On-Device Mixture of Experts

arXiv:2605.27358v1 Announce Type: cross Abstract: Mixture-of-Experts (MoE) has become the de facto architecture for hundred-billion-parameter language models, yet its advantages at sub-billion scales for on-device deployment remain largely unexplored. To close this gap, we presen…

arXiv cs.AI TIER_1 English(EN) · Raghuraman Krishnamoorthi · 2026-05-26 17:58

MobileMoE: Scaling On-Device Mixture of Experts

Mixture-of-Experts (MoE) has become the de facto architecture for hundred-billion-parameter language models, yet its advantages at sub-billion scales for on-device deployment remain largely unexplored. To close this gap, we present MobileMoE, a family of on-device MoE language mo…

Hugging Face Daily Papers TIER_1 English(EN) · 2026-05-26 17:58

MobileMoE: Scaling On-Device Mixture of Experts

Mixture-of-Experts (MoE) has become the de facto architecture for hundred-billion-parameter language models, yet its advantages at sub-billion scales for on-device deployment remain largely unexplored. To close this gap, we present MobileMoE, a family of on-device MoE language mo…

Hugging Face Daily Papers TIER_1 English(EN) · 2026-05-26 00:00

MobileMoE: Scaling On-Device Mixture of Experts

MobileMoE introduces efficient on-device Mixture-of-Experts language models with sub-billion parameters that achieve better performance and efficiency than dense baselines and existing MoE models.
