Entity
DeepSeek-V2-Lite
PulseAugur coverage of DeepSeek-V2-Lite — every cluster mentioning DeepSeek-V2-Lite across labs, papers, and developer communities, ranked by signal.
Total · 30 days: 2 (2 in 90 days)
Releases · 30 days: 0 (0 in 90 days)
Papers · 30 days: 2 (2 in 90 days)
Tier distribution · 90 days
Sentiment · 30 days: 2 days with sentiment data
Recent · page 1 of 1 · 2 items
- New tool DODOCO reveals flaws in MoE model dispatch benchmarks
A new research paper introduces DODOCO, a tool designed to diagnose overhead in dispatch operations for Mixture-of-Experts (MoE) models. The study found that common assumptions about workload representation in benchmark…
- MoE models misroute tokens on complex reasoning tasks, study finds
Researchers have identified a significant issue in Mixture-of-Experts (MoE) language models where the routing mechanism, which directs tokens to specific experts, often selects suboptimal paths. While the standard route…