Brief · PulseAugur

TOOL · r/LocalLLaMA English(EN) · 6h

Scaling VibeThinker-1.5B to 3B: now it reaches frontier math & coding performance

A research effort has scaled the VibeThinker model from 1.5 billion to 3 billion parameters, reaching performance comparable to frontier models in specific domains such as mathematics and coding. VibeThinker-3B posted strong results on benchmarks including AIME'26, LiveCodeBench v6, IMO-AnswerBench, and IFEval, and achieved a 96.1% success rate on recent LeetCode programming contests. The researchers suggest that although small models remain limited in general-purpose applications, they can offer a path to advanced reasoning in parameter-dense areas with clear verification signals, complementing traditional scaling laws.

IMPACT Demonstrates that small, parameter-dense models can achieve frontier-level reasoning in specialized domains, potentially offering cost-effective alternatives to larger models.
