GDPval-AA
PulseAugur coverage of GDPval-AA — every cluster mentioning GDPval-AA across labs, papers, and developer communities, ranked by signal.
1 day with sentiment data
-
Google DeepMind releases Gemini 3.5 Flash for faster agentic tasks
Google DeepMind has launched Gemini 3.5 Flash, a new frontier intelligence model optimized for speed and agentic tasks. This model excels at complex, long-horizon tasks in coding and agent development, outperforming pre…
-
xAI launches Grok 4.3 with improved agentic performance and lower price
xAI has released Grok-4.3, a new iteration of its AI model, which offers improved agentic performance and a lower price point compared to its predecessor. The model achieved a significant increase of 321 ELO points on t…
-
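Xiaomi's open-source MiMo-v2.5-Pro model rivals top AI coding assistants
Xiaomi has released MiMo-v2.5-Pro, an open-source language model focused on coding that shows impressive capabilities on complex tasks. The model completed a university-level compiler project within hours, built a fully functional video editor application from a vague prompt, and solved an analog circuit design problem. MiMo-v2.5-Pro performs strongly on coding benchmarks, rivaling top closed-source models such as GPT-5.4 and Claude Opus 4.6, and is now available on HuggingFace.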