ENTITY DeepSeek MoE

PulseAugur coverage of DeepSeek MoE: every cluster that mentions DeepSeek MoE across labs, papers, and developer communities, ranked by signal.

Total · 30d: 4 (4 over 90d)

Releases · 30d: 0 (0 over 90d)

Papers · 30d: 3 (3 over 90d)

SENTIMENT · 30D

2 days with sentiment data

RECENT · PAGE 1/1 · 4 TOTAL

TOOL · CL_88381 · Jun 13 · 01:05

Mixture of Experts: Performance Gains with Memory Trade-offs

Mixture of Experts (MoE) models offer a way to achieve high performance with lower computational cost per token by activating only a subset of their parameters. While models like Mixtral 8x7B, DeepSeek-MoE, and Qwen2.5-…

FRONTIER RELEASE · CL_62639 · May 30 · 00:00

JetBrains releases efficient Mellum2 MoE model; research advances MoE techniques

JetBrains has released Mellum2, an open-source 12-billion-parameter Mixture-of-Experts (MoE) model optimized for efficient inference in text and code tasks. This model activates only a fraction of its parameters per tok…

TOOL · CL_51135 · May 26 · 04:00

HEAPr algorithm precisely prunes LLM experts, cutting memory needs

Researchers have developed HEAPr, a new pruning algorithm designed to reduce the memory footprint of Mixture-of-Experts (MoE) large language models. Unlike previous methods that prune entire experts, HEAPr breaks down e…

TOOL · CL_29430 · May 12 · 08:57

New framework enhances MoE LLMs on noisy analog hardware

Researchers have introduced ROMER, a post-training calibration framework designed to enhance the robustness of Mixture-of-Experts (MoE) Large Language Models (LLMs) when deployed on analog Compute-in-Memory (CIM) system…