mixture of experts
PulseAugur coverage of mixture of experts — every cluster mentioning mixture of experts across labs, papers, and developer communities, ranked by signal.
- 2026-05-11 research_milestone A new paper proposes an enhanced Mixture-of-Experts framework for faster training of time series forecasting models.
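Most of the items below rely on the same routing pattern. As a point of reference only, here is a minimal, framework-agnostic sketch of a top-k MoE layer in NumPy; it is not the forecasting paper's method, and every name in it is hypothetical. A learned gate scores the experts for each input, only the k highest-scoring experts run, and their outputs are mixed with renormalised gate weights.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def moe_layer(x, gate_w, experts, k=2):
    """Route each input row to its top-k experts and mix their outputs.

    x:        (batch, d_in) inputs
    gate_w:   (d_in, n_experts) gating weights
    experts:  list of callables, each mapping (d_in,) -> (d_out,)
    """
    scores = softmax(x @ gate_w)                      # (batch, n_experts)
    out = None
    for b, row in enumerate(x):
        top = np.argsort(scores[b])[-k:]              # indices of the k best experts
        w = scores[b, top] / scores[b, top].sum()     # renormalise over selected experts
        y = sum(wi * experts[i](row) for wi, i in zip(w, top))
        if out is None:
            out = np.zeros((x.shape[0], y.shape[0]))
        out[b] = y
    return out

# Toy usage: 4 linear experts, 8 inputs of width 16.
rng = np.random.default_rng(0)
d_in, d_out, n_experts = 16, 8, 4
expert_ws = [rng.normal(size=(d_in, d_out)) for _ in range(n_experts)]
experts = [lambda v, W=W: v @ W for W in expert_ws]
gate_w = rng.normal(size=(d_in, n_experts))
x = rng.normal(size=(8, d_in))
print(moe_layer(x, gate_w, experts).shape)            # (8, 8)
```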
-
RD-ViT cuts data needs for segmentation, outperforming standard ViT with fewer parameters
Researchers have developed RD-ViT, a novel Recurrent-Depth Vision Transformer designed for semantic segmentation tasks. This architecture significantly reduces data dependence by using a single, shared transformer block…
-
OneTrackerV2 unifies multimodal visual tracking with Dual Mixture-of-Experts
Researchers have developed a new event-based visual object tracking framework that addresses limitations of existing methods by explicitly modeling event density variations across multiple temporal scales. This approach…
-
RAST-MoE-RL framework enhances ride-hailing efficiency with specialized AI experts
Researchers have developed a new framework called RAST-MoE-RL to improve efficiency in ride-hailing services. This framework utilizes a Mixture-of-Experts (MoE) approach within deep reinforcement learning to better hand…
-
Attention Sink research reveals inherent MoE structure in LLM attention layers
Researchers have identified that the attention sink phenomenon in Large Language Models, where the first token receives disproportionate attention, naturally forms a Mixture-of-Experts (MoE) mechanism within attention l…
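As a toy illustration of that framing only (not the paper's analysis; all constants and names below are made up), the snippet constructs an attention matrix in which one token acts as a sink, and the closing comment spells out the MoE reading of each softmax row as a routing distribution over value vectors.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(1)
seq, d = 8, 32
q = rng.normal(size=(seq, d))
k = rng.normal(size=(seq, d))

# Make the first token's key align with a direction shared by every query,
# mimicking a "sink" that all positions attend to by default.
shared = rng.normal(size=d)
q += 2.0 * shared
k[0] = 3.0 * shared

attn = softmax(q @ k.T / np.sqrt(d))   # (seq, seq) attention weights
print("average weight on token 0:", attn[:, 0].mean())
print("average weight elsewhere: ", attn[:, 1:].mean())

# MoE reading: each row of `attn` is a soft routing distribution over the
# value vectors ("experts"); the sink column behaves like a default expert
# that absorbs probability mass when no other value is strongly selected.
```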
-
Xiaomi unveils MiMo-V2.5-Pro AI model for automated programming tasks
Xiaomi has unveiled its MiMo-V2.5-Pro language model, designed to automate complex programming tasks. Leveraging a Mixture-of-Experts architecture and reduced token requirements, the model can handle processes that prev…
-
Mamoda2.5 model integrates multimodal AI with efficient DiT-MoE for top video editing
Researchers have introduced Mamoda2.5, a unified AR-Diffusion framework designed for multimodal understanding and generation. This model utilizes a Diffusion Transformer backbone enhanced with a Mixture-of-Experts (MoE)…
-
Researchers establish mean-field limit for Mixture of Experts models trained with gradient flow
Researchers have established a mean-field limit for Mixture of Experts (MoE) models trained using gradient flow in supervised learning scenarios. Their findings demonstrate that as the number of experts increases, the m…
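For orientation, mean-field results of this kind are typically phrased in terms of the empirical measure over expert parameters; the display below is a generic sketch of that setup under illustrative notation, not the paper's exact statement.

```latex
% Generic setup (illustrative notation, not the paper's): expert i has gate
% parameters \omega_i and expert parameters w_i, collected as \theta_i.
\[
  f_N(x) \;=\; \frac{1}{N}\sum_{i=1}^{N} g(x;\omega_i)\, h(x;w_i)
        \;=\; \int g(x;\omega)\, h(x;w)\,\mathrm{d}\mu_N(\theta),
  \qquad
  \mu_N \;=\; \frac{1}{N}\sum_{i=1}^{N} \delta_{\theta_i}.
\]
% As N grows, \mu_N converges to a limiting measure \mu, and gradient-flow
% training of the \theta_i is typically shown to correspond to a gradient
% flow on \mu itself.
```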
-
GMGaze model achieves SOTA gaze estimation with CLIP and multiscale transformer
Researchers have introduced GMGaze, a novel approach to gaze estimation that utilizes a multi-scale transformer architecture and incorporates context-aware conditioning. This method addresses limitations in existing mod…
-
FluxMoE system decouples expert weights for faster LLM serving
Researchers have developed FluxMoE, a new system designed to improve the efficiency of serving Mixture-of-Experts (MoE) models. FluxMoE addresses the challenge of large parameter sizes in MoE models by decoupling expert…
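This is not FluxMoE's actual interface; the sketch below, with hypothetical names (`ExpertStore`, `serve_token`), only illustrates the general idea the summary points at: expert weights live outside the dense serving path and are fetched only when the router selects them.

```python
import numpy as np

class ExpertStore:
    """Holds expert weights outside the serving process (here a plain dict,
    standing in for host memory, SSD, or a remote parameter server)."""

    def __init__(self, n_experts, d_in, d_out, seed=0):
        rng = np.random.default_rng(seed)
        self._weights = {i: rng.normal(size=(d_in, d_out)) for i in range(n_experts)}

    def fetch(self, expert_id):
        # In a real system this would be an asynchronous copy into GPU memory.
        return self._weights[expert_id]

def serve_token(x, gate_scores, store, k=2):
    """Run only the top-k experts for one token, fetching their weights on demand."""
    top = np.argsort(gate_scores)[-k:]
    w = gate_scores[top] / gate_scores[top].sum()
    return sum(wi * (x @ store.fetch(i)) for wi, i in zip(w, top))

store = ExpertStore(n_experts=8, d_in=16, d_out=16)
x = np.ones(16)
gate_scores = np.abs(np.random.default_rng(1).normal(size=8))
print(serve_token(x, gate_scores, store).shape)   # (16,)
```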
-
Study finds switchless networks more cost-effective for MoE LLM serving
A new paper analyzes network topologies for Mixture-of-Experts (MoE) Large Language Model (LLM) serving, finding that lower-cost, switchless networks can be more cost-effective than expensive scale-up infrastructures. T…
-
Mixture of Experts framework speeds up atomistic simulations
Researchers have developed a new Mixture-of-Experts (MoE) framework for Machine Learning Interatomic Potentials (MLIPs) to accelerate atomistic simulations. This approach divides simulation domains into regions of varyi…
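The summary is cut off before it names the partitioning criterion or the potentials involved, so the snippet below is only a schematic of the region-routed evaluation pattern, with placeholder potentials and a placeholder region test; none of it is the paper's actual method.

```python
import numpy as np

def cheap_potential(positions):
    # Stand-in for a low-cost MLIP expert (placeholder energy function).
    return 0.1 * np.sum(positions ** 2, axis=-1)

def accurate_potential(positions):
    # Stand-in for an expensive, high-accuracy MLIP expert (placeholder).
    return 0.1 * np.sum(positions ** 2, axis=-1) + 0.01 * np.sum(np.cos(positions), axis=-1)

def region_routed_energy(positions, hot_region_mask):
    """Evaluate each atom with the expert assigned to its region, then sum energies.

    hot_region_mask marks atoms in regions that need the accurate expert;
    the real routing criterion is whatever the paper uses, this flag is a placeholder.
    """
    e = np.empty(len(positions))
    e[~hot_region_mask] = cheap_potential(positions[~hot_region_mask])
    e[hot_region_mask] = accurate_potential(positions[hot_region_mask])
    return e.sum()

rng = np.random.default_rng(3)
positions = rng.normal(size=(100, 3))
hot = np.linalg.norm(positions, axis=-1) < 1.0     # placeholder region criterion
print(region_routed_energy(positions, hot))
```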
-
Liquid AI releases LFM2-24B-A2B, an efficient 24B parameter MoE model
Liquid AI has released an early checkpoint of its LFM2-24B-A2B model, a sparse Mixture of Experts (MoE) architecture with 24 billion total parameters and 2 billion active parameters per token. This model demonstrates th…
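A quick back-of-envelope reading of those two figures, taking the stated totals at face value:

```python
total_params  = 24e9   # total parameters in the MoE
active_params = 2e9    # parameters used per token (active experts plus shared layers)

print(f"active fraction per token: {active_params / total_params:.1%}")   # 8.3%
# Rough intuition: per-token compute scales with active parameters, so the
# forward pass costs roughly what a ~2B dense model would, while the model
# can draw on 24B parameters of capacity.
```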
-
FaaSMoE offers resource-efficient, serverless serving for multi-tenant Mixture-of-Experts models
Researchers have developed FaaSMoE, a novel serverless framework designed for serving Mixture-of-Experts (MoE) models in multi-tenant environments. This architecture deploys individual experts as stateless functions on …
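This is not FaaSMoE's API; the sketch below, with hypothetical names (`expert_fn`, `route_and_invoke`), only illustrates the stateless-expert dispatch pattern the summary describes: each expert is a pure function keyed by id, and gateway-side routing groups tokens by selected expert and invokes each expert function once per group.

```python
import numpy as np

# Hypothetical registry of expert "functions": each call is stateless, so the
# platform can scale instances per expert and share them across tenants.
EXPERT_WEIGHTS = {i: np.random.default_rng(i).normal(size=(16, 16)) for i in range(8)}

def expert_fn(expert_id, tokens):
    """One serverless invocation: a pure function of (expert_id, tokens)."""
    return tokens @ EXPERT_WEIGHTS[expert_id]

def route_and_invoke(tokens, gate_scores, k=2):
    """Gateway-side logic: pick top-k experts per token, group tokens by expert,
    invoke each expert function once per group, then recombine.
    (Gate renormalisation over the selected experts is omitted for brevity.)"""
    out = np.zeros_like(tokens)
    top = np.argsort(gate_scores, axis=-1)[:, -k:]              # (n_tokens, k)
    for expert_id in np.unique(top):
        mask = (top == expert_id).any(axis=-1)                  # tokens routed here
        w = gate_scores[mask, expert_id][:, None]
        out[mask] += w * expert_fn(expert_id, tokens[mask])     # one "invocation"
    return out

tokens = np.random.default_rng(42).normal(size=(10, 16))
gate_scores = np.abs(np.random.default_rng(7).normal(size=(10, 8)))
print(route_and_invoke(tokens, gate_scores).shape)              # (10, 16)
```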
-
New framework uses physics-informed transfer learning for multi-site emission control
Researchers have developed a new physics-informed transfer learning framework designed to improve emission control in municipal solid waste incineration. This framework utilizes a mixture-of-experts model to manage carb…
-
Mixture-of-Experts model applied to GlueX DIRC detector for physics analysis
Researchers have developed a Mixture-of-Experts (MoE) foundation model to streamline data analysis for the GlueX DIRC detector at Jefferson Lab. This unified framework handles fast simulation, particle identification, a…
-
NVIDIA launches Nemotron 3 Nano Omni multimodal AI model for agents
NVIDIA has released Nemotron 3 Nano Omni, a multimodal large language model capable of processing vision, audio, video, and text simultaneously. This open model, built on a Mamba2 Transformer Hybrid Mixture of Experts a…
-
Poolside AI releases open-weight Laguna XS.2 and M.1 coding models
Poolside AI has released two new agentic coding models, Laguna M.1 and Laguna XS.2, along with their agent training and operation runtime. Laguna M.1 is a large Mixture of Experts (MoE) model trained on 30T tokens using…
-
NVIDIA launches Nemotron 3 Nano Omni, unifying multimodal AI for efficiency
NVIDIA has released Nemotron 3 Nano Omni, an open multimodal model capable of processing text, images, audio, and video. This model aims to unify these modalities into a single architecture, improving efficiency and ena…
-
AI models achieve 10x intelligence gains via Mixture of Experts and Transformer architectures
The Transformer architecture, introduced in the paper "Attention Is All You Need," revolutionized AI by enabling models to process information more efficiently. This innovation is key to understanding how models like Op…
-
New framework uses multiple LLMs to reduce hallucination and bias
Researchers have developed a new framework called Council Mode designed to mitigate hallucinations and biases in Large Language Models. This approach involves querying multiple diverse LLMs simultaneously and then synth…