mixture of experts

PulseAugur coverage of mixture of experts: every cluster mentioning the topic across labs, papers, and developer communities, ranked by signal.

Total · 30d

111

111 over 90d

Releases · 30d

0 over 90d

Papers · 30d

87 over 90d

TIER MIX · 90D

frontier release 8
significant 3
research 43
tool 54
commentary 3

TOPICS

paper 87
model release 63
infra 35
product 20
other 15
safety 8
funding 1

RELATIONSHIPS

instance of Mixture of Experts (MoE) 95%
instance of arXiv 90%
used by large-language models 90%
instance of Innu-aimun 90%
instance of DeepSeek-V2-Lite 90%
instance of DeepSeek MoE 90%
instance of DeepSeek V4-Flash 90%
instance of Emo 90%
used by SGLang 90%
uses large-language models 80%
instance of large-language models 70%
instance of transformers 70%

TIMELINE

2026-05-11 research_milestone A new paper proposes an enhanced Mixture-of-Experts framework for faster time series forecasting model training.

SENTIMENT · 30D

20 days with sentiment data

RECENT · PAGE 6/6 · 111 TOTAL

RESEARCH · CL_06713 · Apr 28 · 04:00

New framework uses multiple LLMs to reduce hallucination and bias

Researchers have developed a new framework called Council Mode designed to mitigate hallucinations and biases in Large Language Models. This approach involves querying multiple diverse LLMs simultaneously and then synth…
RESEARCH · CL_06701 · Apr 28 · 04:00

New simulator stress-tests AI emotional support chatbots with diverse user profiles

Researchers have developed a new controllable simulator to better evaluate emotional support chatbots. This simulator addresses limitations in current systems by incorporating diverse psychological and linguistic featur…
FRONTIER RELEASE · CL_07710 · Apr 27 · 19:49

NVIDIA launches Nemotron 3 Nano Omni, unifying multimodal AI for efficiency

NVIDIA has released Nemotron 3 Nano Omni, an open multimodal model capable of processing text, images, audio, and video. This model aims to unify these modalities into a single architecture, improving efficiency and ena…
RESEARCH · CL_06215 · Apr 27 · 03:23

SMoES improves MoE-VLM efficiency and effectiveness with soft modality guidance

Researchers have introduced SMoES, a novel approach for guiding expert routing in Mixture-of-Experts (MoE) vision-language models (VLMs). This method utilizes dynamic soft modality scores to account for layer-dependent …
RESEARCH · CL_03609 · Apr 24 · 16:44

Researchers propose new methods to decouple model parameters from computation

Researchers have introduced novel methods to decouple model size from computational cost in deep learning. One approach, 'hash layers,' allows for larger models with fewer computational operations by using hashing for e…
FRONTIER RELEASE · CL_00752 · Apr 24 · 13:30

DeepSeek previews new AI model that ‘closes the gap’ with frontier models

DeepSeek has released its V4 AI model, featuring two versions: V4-Pro and V4-Flash. These models boast a 1 million token context window and utilize a mixture-of-experts architecture for efficiency. While DeepSeek V4 aim…
RESEARCH · CL_05070 · Apr 24 · 11:35

AI research explores functorial formulations, causal learning, and adaptive model merging

Researchers have developed a multi-fidelity surrogate modeling framework to predict wind loads on container ships, combining empirical data with CFD simulations for improved accuracy and reduced computational cost. Anot…
RESEARCH · CL_02843 · Apr 22 · 13:37

New MoE architectures enhance efficiency and performance

Researchers are developing advanced techniques to improve Mixture-of-Experts (MoE) models, particularly addressing challenges in domain transitions and inference efficiency. One approach, inspired by the Free Energy Pri…
RESEARCH · CL_03266 · Apr 22 · 00:05

Cohere details how MoE models boost speculative decoding effectiveness

Cohere has released a technical report detailing how Mixture-of-Experts (MoE) models can enhance speculative decoding. Contrary to initial expectations, the research indicates that MoE architectures actually improve the…
FRONTIER RELEASE · CL_47594 · Apr 13 · 09:12

Qwen releases 27B multimodal model for advanced coding

Qwen has released Qwen3.6-27B, a dense 27-billion-parameter multimodal model designed for advanced coding tasks. This model aims to provide flagship-level agentic coding performance, surpassing previous open-source mode…
FRONTIER RELEASE · CL_00821 · Jan 19 · 04:00

DeepSeek v3 leads open-weight models, Baseten enables mission-critical inference

DeepSeek v3, a new 671B-parameter Mixture-of-Experts model, has been released and is currently the top-performing open-weight model available. Serving such large models presents significant challenges, but inference st…
