PulseAugur
research · [1 source]

Alibaba's Qwen 3 models range from 0.6B to 235B parameters, reportedly outperforming DeepSeek-R1 and o1

Alibaba's Qwen team has released Qwen 3, a suite of models ranging from 0.6 billion to 235 billion parameters. The lineup includes both dense and Mixture-of-Experts (MoE) variants, which reportedly surpass competing models such as DeepSeek-R1 and OpenAI's o1. The range of sizes caters to different computational budgets and applications.
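
For readers who want to try the release, below is a minimal sketch of loading one of the smaller checkpoints with Hugging Face transformers. The Hub ID `Qwen/Qwen3-0.6B`, the chat-template call, and the generation settings are assumptions based on how earlier Qwen models were published, not details from the source.

```python
# Minimal sketch: load a small Qwen 3 checkpoint and run one chat turn.
# The model ID below is an assumption (not taken from the article); swap in
# whichever size fits your hardware.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen3-0.6B"  # assumed Hugging Face Hub ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the checkpoint's native precision
    device_map="auto",    # place weights on GPU if one is available
)

messages = [{"role": "user", "content": "Give a one-sentence summary of MoE models."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128)
# Strip the prompt tokens and decode only the newly generated text.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

The larger dense and MoE checkpoints should follow the same API; only the checkpoint name and the hardware requirements change.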

Summary written by gemini-2.5-flash-lite from 1 source.

RANK_REASON Release of a new suite of open-source models from a major tech company's AI lab, with performance claims that warrant research-level classification.


COVERAGE [1]

  1. Smol AINews TIER_1

    Qwen 3: 0.6B to 235B MoE full+base models that beat R1 and o1

    **Alibaba** has released **Qwen 3**, featuring a range of models including two MoE variants, **Qwen3-235B-A22B** and **Qwen3-30B-A3B**, which demonstrate competitive performance against top models like **DeepSeek-R1**, **o1**, **o3-mini**, **Grok-3**, and **Gemini-2.5-Pro**…