Entity
FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning in AI
PulseAugur coverage of FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning in AI. Lists every cluster that mentions the benchmark across labs, papers, and developer communities, ranked by signal.
Total · 30 days
2
Last 90 days: 2
Releases · 30 days
0
Last 90 days: 0
Papers · 30 days
2
Last 90 days: 2
Tier distribution · 90 days
Recent · Page 1 of 1 · 2 items
-
AI Co-Mathematician accelerates research with agentic support for mathematicians
Researchers have developed an AI co-mathematician system designed to assist mathematicians in their research workflows. This system provides comprehensive support for tasks such as ideation, literature review, computati…
-
OpenAI's GPT-5.2 advances science and math, with evaluations showing low catastrophic risk
OpenAI has released GPT-5.2, a new model demonstrating significant advancements in mathematical and scientific reasoning. The model achieved high scores on benchmarks like GPQA Diamond and FrontierMath, indicating impro…