PulseAugur

Moonshot AI's Kimi K2 model achieves SOTA with 1T parameters and 15.5T training tokens

Moonshot AI has released Kimi K2, an open Mixture-of-Experts (MoE) model with 1 trillion parameters, trained on 15.5 trillion tokens. The release marks a significant step in open-source large language model development.

Summary written by gemini-2.5-flash-lite from 1 source.

Rank reason: Release of an open-source model from a non-frontier lab.

Coverage (1 source)

  1. Smol AINews (Tier 1)

    Kimi K2 - SOTA Open MoE proves that Muon can scale to 15T tokens/1T params

    **Moonshot AI** has released **Kimi K2**, a **1 trillion parameter** Mixture-of-Experts model trained on **15.5 trillion tokens** using the new **MuonClip** optimizer, achieving state-of-the-art results on benchmarks like **SWE-Bench Verified (65.8%)** and **TAU2 (58.4%)**. This …
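    For context on the optimizer named in the coverage: below is a minimal sketch of the Muon-style update that MuonClip builds on (orthogonalized momentum via a Newton-Schulz iteration), based on the public open-source Muon reference implementation. The function names, coefficients, and shape-based scaling here are assumptions drawn from that reference, not Moonshot AI's code, and the QK-clipping component specific to MuonClip is not detailed in the source, so it is omitted.

```python
import torch

def newton_schulz_orthogonalize(G: torch.Tensor, steps: int = 5,
                                eps: float = 1e-7) -> torch.Tensor:
    """Approximately orthogonalize a 2D update matrix via a quintic
    Newton-Schulz iteration (coefficients from the open Muon reference)."""
    a, b, c = 3.4445, -4.7750, 2.0315
    X = G.float()
    X = X / (X.norm() + eps)           # normalize so the iteration converges
    transpose = X.size(0) > X.size(1)
    if transpose:                      # iterate on the wide orientation
        X = X.T
    for _ in range(steps):
        A = X @ X.T
        X = a * X + (b * A + c * A @ A) @ X
    return X.T if transpose else X

def muon_step(param: torch.Tensor, grad: torch.Tensor, momentum: torch.Tensor,
              lr: float = 0.02, beta: float = 0.95) -> None:
    """One simplified Muon update for a 2D weight matrix:
    accumulate momentum, orthogonalize it, then apply a scaled step."""
    momentum.mul_(beta).add_(grad)     # heavy-ball momentum buffer
    update = newton_schulz_orthogonalize(momentum)
    # Scale so update magnitude is roughly shape-independent (a common
    # Muon choice; assumed here, not confirmed for MuonClip).
    update *= max(1.0, param.size(0) / param.size(1)) ** 0.5
    param.data.add_(update, alpha=-lr)
```

    The appeal of orthogonalizing each 2D weight update is that it keeps the update well-conditioned regardless of the raw gradient's spectrum, which is one plausible reason an optimizer in this family could remain stable over a 15.5T-token run; the clipping in MuonClip reportedly addresses attention-logit blowups, but the source gives no further detail.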