GDPval-AA
PulseAugur coverage of GDPval-AA — every cluster mentioning GDPval-AA across labs, papers, and developer communities, ranked by signal.
-
GLM-5.2 leads open-weights models on real-world agentic work benchmark · 2 sources tracked
GLM-5.2 has emerged as the most popular new model on the Fireworks AI platform over the past week. This open-weights model has achieved the third overall position on the GDPval-AA benchmark, which evaluates performance …
-
Fireworks AI offers Zhipu AI's GLM-5.2, top open-weights coding model
Fireworks AI has announced that GLM-5.2 is now available on its inference platform, highlighting its performance as the top-ranked open-weights model for coding and third overall on the GDPval-AA benchmark. The model, d…
-
Google DeepMind releases Gemini 3.5 Flash for faster agentic tasks
Google DeepMind has launched Gemini 3.5 Flash, a new frontier intelligence model optimized for speed and agentic tasks. This model excels at complex, long-horizon tasks in coding and agent development, outperforming pre…
-
xAI launches Grok 4.3 with improved agentic performance and lower price
xAI has released Grok-4.3, a new iteration of its AI model, which offers improved agentic performance and a lower price than its predecessor. The model achieved a significant increase of 321 Elo points on t…
-
Xiaomi's MiMo-v2.5-Pro open-source model rivals top AI coding assistants
Xiaomi has released MiMo-v2.5-Pro, an open-source coding-focused language model that demonstrates impressive capabilities in complex tasks. The model successfully completed a university-level compiler project in hours, …