PulseAugur

mixture of experts

PulseAugur coverage of mixture of experts: every cluster mentioning the topic across labs, papers, and developer communities, ranked by signal.

Total · 30d: 11 (90d: 11)
Releases · 30d: 0 (90d: 0)
Papers · 30d: 8 (90d: 8)
TIMELINE
  1. 2026-05-11 · research_milestone · A new paper proposes an enhanced Mixture-of-Experts framework for faster training of time series forecasting models.
SENTIMENT · 30D

4 days with sentiment data

RECENT · PAGE 1/3 · 49 TOTAL
  1. COMMENTARY · CL_29758 ·

    MoE architectures are workarounds for LLM training instability, not ideal solutions

    Mixture-of-Experts (MoE) architectures are often presented as an efficient solution for scaling large language models, but this analysis argues they are primarily a workaround for training instability in dense transform…
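
    As background for this discussion (a minimal sketch, not the specific design of any cluster in this feed): a top-k gated MoE layer routes each token to a few expert FFNs and mixes their outputs with the renormalised router scores.

        # Minimal top-k gated MoE layer (illustrative sketch only).
        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class TopKMoE(nn.Module):
            def __init__(self, d_model: int, d_ff: int, n_experts: int, k: int = 2):
                super().__init__()
                self.k = k
                self.router = nn.Linear(d_model, n_experts)   # routing logits per token
                self.experts = nn.ModuleList(
                    nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
                    for _ in range(n_experts)
                )

            def forward(self, x):                              # x: (tokens, d_model)
                topk_vals, topk_idx = self.router(x).topk(self.k, dim=-1)
                weights = F.softmax(topk_vals, dim=-1)         # renormalise over the k picked experts
                out = torch.zeros_like(x)
                for slot in range(self.k):
                    for e, expert in enumerate(self.experts):
                        mask = topk_idx[:, slot] == e          # tokens routed to expert e in this slot
                        if mask.any():
                            out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
                return out

        layer = TopKMoE(d_model=64, d_ff=256, n_experts=8, k=2)
        print(layer(torch.randn(16, 64)).shape)                # torch.Size([16, 64])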

  2. RESEARCH · CL_28307 ·

    New research optimizes Sparse Mixture-of-Experts for efficient LLM scaling

    Researchers are exploring new methods to optimize Sparse Mixture-of-Experts (SMoE) models, which are crucial for scaling large language models efficiently. One paper reveals a geometric coupling between routers and expe…
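
    The cluster's method is cut off above; as general background (a standard technique, not necessarily this paper's contribution), sparse-MoE training commonly adds a Switch-Transformer-style auxiliary loss to keep expert load balanced.

        # Standard load-balancing auxiliary loss for sparse MoE routing (background sketch).
        import torch
        import torch.nn.functional as F

        def load_balance_loss(router_logits, topk_idx, n_experts):
            # router_logits: (tokens, n_experts) raw gate scores
            # topk_idx:      (tokens, k) experts actually chosen per token
            mean_prob = F.softmax(router_logits, dim=-1).mean(dim=0)         # avg routing prob per expert
            counts = F.one_hot(topk_idx, n_experts).sum(dim=(0, 1)).float()  # tokens received per expert
            load_frac = counts / counts.sum()                                # fraction of routed tokens
            return n_experts * torch.dot(load_frac, mean_prob)               # smallest when load is uniform

        logits = torch.randn(32, 8)
        chosen = logits.topk(2, dim=-1).indices
        print(load_balance_loss(logits, chosen, n_experts=8))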

  3. TOOL · CL_27710 ·

    New MoE framework speeds up time series forecasting training

    Researchers have developed a new Mixture-of-Experts (MoE) framework designed to accelerate the training of time series forecasting models. This method integrates expert-specific loss information directly into the traini…

  4. TOOL · CL_25314 ·

    UC Berkeley and AI2 propose EMO for emergent modularity in MoE models

    Researchers from UC Berkeley and the Allen Institute for AI have introduced EMO, a method that encourages emergent modularity in Mixture of Experts (MoE) models through pre-training. This approach investigates how struc…

  5. SIGNIFICANT · CL_23645 ·

    DeepSeek releases open-source coding model matching GPT-4o

    DeepSeek has released V3-0324, an open-source coding model that matches or surpasses leading models like GPT-4o and Claude 3.5 Sonnet in coding performance. This Mixture-of-Experts model, with 671 billion total paramete…

  6. TOOL · CL_25610 ·

    MoE models misroute tokens on complex reasoning tasks, study finds

    Researchers have identified a significant issue in Mixture-of-Experts (MoE) language models where the routing mechanism, which directs tokens to specific experts, often selects suboptimal paths. While the standard route…
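
    The study's own evaluation protocol is not shown in the excerpt; a hypothetical way to quantify this kind of misrouting is to compare the router's top-1 pick against an "oracle" expert that would have incurred the lowest loss on each token.

        # Hypothetical misrouting probe (illustrative only, not the paper's methodology).
        import torch

        def misrouting_rate(per_expert_loss, router_logits):
            # per_expert_loss: (tokens, n_experts) loss each expert would incur per token
            # router_logits:   (tokens, n_experts) the router's scores
            routed = router_logits.argmax(dim=-1)    # expert the top-1 router actually picks
            oracle = per_expert_loss.argmin(dim=-1)  # expert that would have been best
            return (routed != oracle).float().mean().item()

        torch.manual_seed(0)
        print(misrouting_rate(torch.rand(1000, 8), torch.randn(1000, 8)))  # ≈ 0.875 for unrelated scores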

  7. TOOL · CL_21909 ·

    Graph Normalization offers differentiable approximation for NP-hard MWIS problem

    Researchers have developed Graph Normalization (GN), a novel dynamical system that approximates the NP-hard Maximum Weight Independent Set (MWIS) problem. GN offers a principled and differentiable approach, converging t…
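
    The excerpt does not detail GN's dynamics; a generic differentiable relaxation of MWIS (not the paper's method) relaxes the binary node choices to sigmoids and penalises selecting both endpoints of any edge.

        # Generic penalty relaxation of Maximum Weight Independent Set (illustrative sketch).
        import torch

        def mwis_relaxation(weights, edges, penalty=10.0, steps=500, lr=0.1):
            theta = torch.zeros(len(weights), requires_grad=True)
            w = torch.tensor(weights, dtype=torch.float32)
            opt = torch.optim.Adam([theta], lr=lr)
            for _ in range(steps):
                x = torch.sigmoid(theta)                        # soft node selection in [0, 1]
                conflict = sum(x[i] * x[j] for i, j in edges)   # cost of picking adjacent nodes
                loss = -(w * x).sum() + penalty * conflict
                opt.zero_grad()
                loss.backward()
                opt.step()
            return (torch.sigmoid(theta) > 0.5).nonzero().flatten().tolist()

        # Path graph 0-1-2 with weights 1, 3, 1; the optimum independent set is {1}.
        print(mwis_relaxation([1.0, 3.0, 1.0], [(0, 1), (1, 2)]))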

  8. RESEARCH · CL_22189 ·

    EMO model enables modularity in large language models with selective expert use

    Researchers have developed EMO, a novel Mixture-of-Experts (MoE) model designed for emergent modularity. Unlike traditional monolithic large language models, EMO activates only specific subsets of its parameters for dif…

  9. TOOL · CL_22046 ·

    New MoE inference design uses pooled HBM to cut communication latency on Ascend

    Researchers have developed a new communication design for Mixture-of-Experts (MoE) inference on Ascend systems, aiming to reduce bottlenecks in token exchange. This approach eliminates intermediate relay and reordering …

  10. RESEARCH · CL_21995 ·

    New SAMoE-C method improves CSI-based HAR with scene-adaptive experts

    Researchers have developed a new method called Scene-Adaptive Mixture of Experts with Clustered Specialists (SAMoE-C) to improve human activity recognition using channel state information (CSI). This approach addresses …

  11. TOOL · CL_21907 ·

    New research explores finite expert banks for communication-efficient MoE architectures

    Researchers have developed a new framework for analyzing sparse Mixture-of-Experts (MoE) architectures, focusing on communication efficiency. They propose treating the MoE gate as a stochastic channel and quantifying ro…
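
    The framework's "stochastic channel" view is only partially quoted above; one basic quantity in that view (shown here as an illustration, with a made-up count matrix) is the mutual information between inputs and the experts they are routed to.

        # Mutual information of a routing "channel", estimated from a joint count matrix.
        import numpy as np

        def routing_mutual_information(counts):
            # counts[g, e] = number of tokens from input group g routed to expert e
            p = counts / counts.sum()                                  # joint P(group, expert)
            pg = p.sum(axis=1, keepdims=True)                          # marginal over groups
            pe = p.sum(axis=0, keepdims=True)                          # marginal over experts
            mask = p > 0
            return float((p[mask] * np.log2(p[mask] / (pg @ pe)[mask])).sum())

        counts = np.array([[90, 5, 5], [5, 90, 5], [5, 5, 90]])        # hypothetical routing counts
        print(routing_mutual_information(counts))                      # ≈ 1.02 bits (max log2(3) ≈ 1.58)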

  12. RESEARCH · CL_21794 ·

New parameter E predicts Mixture-of-Experts model health, preventing dead experts

    Researchers have introduced a new dimensionless control parameter, E = T*H/(O+B), to predict the health of expert ecologies in Mixture-of-Experts (MoE) models. This parameter, derived from four hyperparameters, can prev…

  13. TOOL · CL_20870 ·

    Zyphra's ZAYA1-8B MoE model trained on AMD hardware outperforms larger rivals

    Zyphra AI has released ZAYA1-8B, a Mixture of Experts (MoE) language model with 760 million active parameters and 8.4 billion total parameters. Trained on AMD hardware, this model demonstrates competitive performance ag…

  14. TOOL · CL_20383 ·

    LAWS architecture offers self-certifying inference caching for LLMs and robotics

    Researchers have introduced LAWS, a novel caching architecture designed to improve the efficiency of neural inference, robotics, and edge deployments. This system builds a library of certified expert functions by observ…

  15. TOOL · CL_20549 ·

    Tropical geometry reveals sparsity is combinatorial depth in MoE models

    A new paper introduces a theoretical framework for understanding Mixture-of-Experts (MoE) models using tropical geometry. The research establishes that the routing mechanism in MoE architectures is equivalent to a speci…
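
    The exact equivalence is cut off in the excerpt; the basic correspondence such work builds on (stated here as background, not as the paper's theorem) is that a router taking the max over affine expert scores is a tropical polynomial in the (max, +) semiring:

        g(x) = \max_{1 \le i \le E} \bigl( \langle w_i, x \rangle + b_i \bigr)
             = \bigoplus_{i=1}^{E} b_i \odot x_1^{\odot w_{i1}} \odot \cdots \odot x_d^{\odot w_{id}},
        \qquad a \oplus b := \max(a, b), \quad a \odot b := a + b,

    i.e. each expert's affine routing score is a tropical monomial, and selecting the argmax expert amounts to asking which monomial attains the maximum (strictly, a tropical signomial when the weights are not integers).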

  16. TOOL · CL_20547 ·

    MoLF model predicts pan-cancer gene expression from histology images

    Researchers have developed MoLF, a novel generative model designed for predicting pan-cancer spatial gene expression from histology images. This model utilizes a conditional Flow Matching objective and a Mixture-of-Expe…

  17. RESEARCH · CL_20274 ·

    Geometry-aware model advances whole-slide image analysis in computational pathology

    Researchers have developed BatMIL, a novel framework for analyzing whole-slide histopathological images. This approach utilizes a hybrid hyperbolic-Euclidean representation to better capture hierarchical tissue structur…

  18. RESEARCH · CL_18472 ·

    NVIDIA open-sources cuDNN kernels after 12 years, including MoE and sparse attention

    NVIDIA has open-sourced parts of its cuDNN library, a significant move after 12 years of it being closed-source. This release includes over 20 Mixture-of-Experts (MoE) kernels and NSA sparse attention kernels. The codeb…

  19. TOOL · CL_18630 ·

    SMoE paper proposes expert substitution for efficient edge MoE deployment

    Researchers have developed SMoE, a novel algorithm-system co-design aimed at enabling Mixture of Experts (MoE) models to run on edge devices. This approach tackles memory limitations by dynamically offloading experts an…
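
    The paper's expert-substitution rule is not reproduced in the excerpt; as a generic illustration of the offloading side of such designs, edge MoE inference is often modelled as a small device-side cache of expert weights backed by host memory.

        # Generic LRU cache of expert weights for edge MoE offloading (illustrative only).
        from collections import OrderedDict

        class ExpertCache:
            def __init__(self, load_fn, capacity):
                self.load_fn = load_fn            # fetches an expert's weights from host/flash
                self.capacity = capacity          # how many experts fit in device memory
                self.cache = OrderedDict()
                self.hits = self.misses = 0

            def get(self, expert_id):
                if expert_id in self.cache:
                    self.cache.move_to_end(expert_id)      # mark as most recently used
                    self.hits += 1
                else:
                    self.misses += 1
                    if len(self.cache) >= self.capacity:
                        self.cache.popitem(last=False)     # evict least recently used expert
                    self.cache[expert_id] = self.load_fn(expert_id)
                return self.cache[expert_id]

        cache = ExpertCache(load_fn=lambda e: f"weights[{e}]", capacity=4)
        for e in [0, 1, 2, 0, 5, 7, 0, 2]:                 # a routing trace
            cache.get(e)
        print(cache.hits, cache.misses)                    # 3 5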

  20. TOOL · CL_20119 ·

    Apple researchers unveil SpecMD for faster MoE model inference

    Apple's machine learning research team has published a paper detailing SpecMD, a new framework for evaluating Mixture-of-Experts (MoE) model caching policies. Their experiments show that traditional caching assumptions …