PulseAugur

mixture of experts

PulseAugur coverage of mixture of experts: every cluster that mentions mixture of experts across labs, papers, and developer communities, ranked by signal.

Total · 30 days: 73 (90 days: 73)
Releases · 30 days: 0 (90 days: 0)
Papers · 30 days: 59 (90 days: 59)
Timeline
  1. 2026-05-11 research_milestone A new paper proposes an enhanced Mixture-of-Experts framework for faster training of time series forecasting models.
Sentiment · 30 days

12 days with sentiment data

Recent · Page 1 of 4 · 73 items total
  1. SIGNIFICANT · CL_49808 ·

    Meta releases Llama 4 with Mixture of Experts architecture

    Meta released Llama 4 in April 2025, featuring a new Mixture of Experts (MoE) architecture. Two variants, Scout and Maverick, are available, with Scout serving as a balanced default and Maverick offering broader kno…

  2. TOOL · CL_48811 ·

    ZipMoE system enables efficient on-device serving of large language models

    Researchers have developed ZipMoE, a system designed to make Mixture-of-Experts (MoE) large language models more efficient for on-device deployment. ZipMoE utilizes lossless compression and a cache-affinity scheduling a…

  3. TOOL · CL_48045 ·

    Fireworks AI flags numerical drift in LLM training vs. serving

    Fireworks AI has identified critical numerical parity bugs that can arise when training and serving large language models, particularly Mixture-of-Experts (MoE) architectures. These discrepancies, stemming from the non-…

  4. TOOL · CL_44132 ·

    Alibaba's Qwen3-Coder-Next achieves 70.6 on SWE-Bench with sparse MoE

    Alibaba's Qwen3-Coder-Next, an 80-billion-parameter model with 3 billion active parameters, has achieved a score of 70.6 on the SWE-Bench Verified benchmark. This performance is notable as it rivals top closed-source model…

  5. RESEARCH · CL_48284 ·

    New decoding method tackles hallucinations in vision-language models

    Researchers have developed a new inference-time framework called CHASd to combat hallucinations in Large Vision-Language Models (LVLMs). This method, Contrastive Hallucination-Aware Step-wise Decoding, selectively activ…

  6. TOOL · CL_44778 ·

    Research quantifies LLM performance, energy, and privacy trade-offs on mobile devices

    A new research paper explores the trade-offs between performance, energy consumption, and privacy when running large language models on mobile devices. The study developed an experimental pipeline to measure these facto…

  7. RESEARCH · CL_44669 ·

    New research tackles continual learning in LLMs with novel MoE methods

    Two new research papers propose novel approaches to continual learning in large language and vision-language models, aiming to mitigate catastrophic forgetting. CP-MoE introduces a transient expert to guide updates and …

  8. TOOL · CL_49356 ·

    SpikingMoE integrates Mixture-of-Experts into spike-driven Transformers

    Researchers have introduced SpikingMoE, a novel framework that combines Spiking Neural Networks (SNNs) with a Mixture-of-Experts (MoE) architecture. This approach utilizes a spike-driven prompt (SDprompt) for biological…

  9. RESEARCH · CL_44023 ·

    FAME framework uses LLMs for efficient log anomaly detection

    Researchers have developed FAME, a novel framework for message-level log anomaly detection that significantly reduces the need for manual labeling. This system utilizes a Mixture-of-Experts approach, employing large lan…

  10. RESEARCH · CL_42192 ·

    OpenAI o3 disproves conjecture, eyes $850B IPO; Cohere releases MoE model

    OpenAI's latest model, o3, has reportedly disproven an Erdős conjecture through extensive reasoning. Concurrently, OpenAI is rumored to be preparing for an IPO with a valuation of $850 billion. In related news, Cohere h…

  11. TOOL · CL_41462 ·

    AI efficiency vs. interpretability: a sparse vs. dense tradeoff

    The human brain's extreme energy efficiency, estimated to be 10,000 times greater than that of current AI models, is attributed to its sparse and localized processing. While techniques like mixture-of-experts offer a path towar…

  12. RESEARCH · CL_42129 ·

    New research enables efficient hyperparameter transfer for large neural networks

    Researchers have developed new methods for hyperparameter transfer, enabling more efficient scaling of large neural networks. One paper introduces a parameterization justified by dynamical mean-field theory, allowing re…

  13. TOOL · CL_42518 ·

    FedCoE framework balances generalization and personalization in Federated Learning

    Researchers have introduced FedCoE, a novel framework for Federated Learning that aims to balance global generalization with local personalization. Unlike traditional methods that struggle with non-IID data or overfit t…

  14. RESEARCH · CL_41759 ·

    New tool DODOCO reveals flaws in MoE model dispatch benchmarks

    A new research paper introduces DODOCO, a tool designed to diagnose overhead in dispatch operations for Mixture-of-Experts (MoE) models. The study found that common assumptions about workload representation in benchmark…

  15. TOOL · CL_41905 ·

    New HDMoE framework enhances cancer survival prediction with multimodal data

    Researchers have developed a new framework called HDMoE to improve multimodal cancer survival prediction. This hierarchical decoupling-fusion mixture-of-experts approach aims to better integrate data from sources like w…

  16. RESEARCH · CL_41793 ·

    Dynamic TMoE framework improves time series forecasting with adaptive experts

    Researchers have developed Dynamic TMoE, a novel framework designed to improve non-stationary time series forecasting. This approach addresses the limitations of existing Mixture-of-Experts (MoE) models by dynamically a…

  17. RESEARCH · CL_41804 ·

    Vision MoE models show stable animate-inanimate expert specialization

    Researchers have developed new methods to analyze the internal workings of Mixture-of-Experts (MoE) models in computer vision. Their work moves beyond simply examining how data is routed to specific "experts" within the…

  18. TOOL · CL_41191 ·

    New MoE framework enhances brain decoding with network-aware experts

    Researchers have developed FPED, a novel Mixture-of-Experts (MoE) framework designed for interpretable brain decoding using fMRI data. This approach explicitly models different functional brain networks as specialized e…

  19. FRONTIER RELEASE · CL_33854 ·

    DeepSeek V4 debuts with MegaMoE optimizations for efficient MoE

    DeepSeek has released its V4 model, featuring significant optimizations through a new system called MegaMoE. This system utilizes a 1400-line fused CUDA kernel to enhance performance by fine-grained pipelining of commun…

  20. RESEARCH · CL_36345 ·

    New $\phi$-balancing framework improves MoE model training

    Researchers have introduced a new framework called $\phi$-balancing to improve the training of Mixture-of-Experts (MoE) models. This method aims to achieve better expert utilization by directly targeting population-leve…