Mixture of Experts (MoE)
PulseAugur coverage of Mixture of Experts (MoE): every cluster mentioning MoE across labs, papers, and developer communities, ranked by signal.
5 days with sentiment data
-
Hugging Face details AI advancements in models, agents, and transformers
Hugging Face is publishing a series of blog posts detailing advancements in AI. These include new models and techniques for multimodal embeddings, improved interactive world generation for GPUs, and strategies for AI pr…
-
Complete-muE framework optimizes hyperparameter transfer for MoE models
Researchers have introduced Complete-muE, a novel framework designed to optimize hyperparameter transfer for Mixture-of-Experts (MoE) models. This system addresses the limitations of existing tools by enabling effective…
-
Dynamic TMoE framework improves time series forecasting with adaptive experts
Researchers have developed Dynamic TMoE, a novel framework designed to improve non-stationary time series forecasting. This approach addresses the limitations of existing Mixture-of-Experts (MoE) models by dynamically a…
-
Fireworks AI enables fine-tuning of trillion-parameter MoE models
Fireworks AI has developed a new training infrastructure that enables the fine-tuning of trillion-parameter Mixture-of-Experts (MoE) models, overcoming previous memory and orchestration bottlenecks. This platform was in…
-
New method allows MoE models to skip more than half of their experts
Researchers have developed a new framework called Zero-Expert Self-Distillation Adaptation (ZEDA) to make existing Mixture-of-Experts (MoE) language models more efficient. ZEDA allows post-trained static MoE models to d…
-
New $\phi$-balancing framework improves MoE model training
Researchers have introduced a new framework called $\phi$-balancing to improve the training of Mixture-of-Experts (MoE) models. This method aims to achieve better expert utilization by directly targeting population-leve…
-
EMO framework eases MoE training by expanding expert pool progressively
Researchers have introduced EMO, a novel framework for training Mixture-of-Experts (MoE) models that progressively expands the expert pool during training. This approach addresses the inefficiency paradox in MoE models,…