Qwen3-30B-A3B
PulseAugur coverage of Qwen3-30B-A3B — every cluster mentioning Qwen3-30B-A3B across labs, papers, and developer communities, ranked by signal.
2 days with sentiment data
-
New method allows MoE models to skip over half of experts
Researchers have developed a new framework called Zero-Expert Self-Distillation Adaptation (ZEDA) to make existing Mixture-of-Experts (MoE) language models more efficient. ZEDA allows post-trained static MoE models to d…
-
MoE models misroute tokens on complex reasoning tasks, study finds
Researchers have identified a significant issue in Mixture-of-Experts (MoE) language models where the routing mechanism, which directs tokens to specific experts, often selects suboptimal paths. While the standard route…
-
Researchers propose efficient LLM classification probes to reduce latency and VRAM
Researchers have developed a method to integrate classification tasks, such as safety checks, directly into the forward pass of large language models (LLMs). This approach uses lightweight probes trained on the LLM's in…