GDPval-AA
PulseAugur coverage of GDPval-AA — every cluster mentioning GDPval-AA across labs, papers, and developer communities, ranked by signal.
-
xAI launches Grok 4.3 with improved agentic performance and lower price
xAI has released Grok 4.3, a new iteration of its AI model that offers improved agentic performance at a lower price than its predecessor. The model achieved an increase of 321 Elo points on t…
-
Xiaomi's MiMo-v2.5-Pro open-source model rivals top AI coding assistants
Xiaomi has released MiMo-v2.5-Pro, an open-source, coding-focused language model that handles complex tasks. The model completed a university-level compiler project in hours, …
-
Anthropic's Claude Sonnet 4.6 upgrades capabilities; Cursor valuation soars
Anthropic has released Claude Sonnet 4.6, an upgrade to its previous Sonnet 4.5 model. The new version brings broad improvements across coding, computer use, and long-context reasoning, and includes a 1 million t…