Fireworks AI has developed a new training infrastructure that enables the fine-tuning of trillion-parameter Mixture-of-Experts (MoE) models, overcoming previous memory and orchestration bottlenecks. This platform was instrumental in the recent release of Cursor's Composer 2.5, a coding model that achieved top performance on several benchmarks. The system uses techniques such as low-precision expert quantization and optimizer state offloading to manage the memory demands of large MoE models, making them more accessible for training and fine-tuning.
AI IMPACT Enables training of trillion-parameter MoE models, potentially accelerating the development of more capable frontier models.
RANK_REASON Fireworks AI's blog post details their infrastructure for training large MoE models, which was used to train Cursor's Composer 2.5.
- Composer 2
- Cursor
- Fireworks AI
- Kimi K2.5
- Mixture-of-Experts (MoE) models
- Qwen3-30B
- Composer 2.5
- CursorBench
- Mixture-of-Experts (MoE)
- SWE-bench Multilingual
- Terminal-Bench
AI-generated summary · Google Gemini · from 2 sources.