miniF2F

PulseAugur coverage of miniF2F: every cluster mentioning miniF2F across labs, papers, and developer communities, ranked by signal.

Total · 30d: 9 (9 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 7 (7 over 90d)

TIER MIX · 90D

significant 1
research 2
tool 6

SENTIMENT · 30D

7 days with sentiment data

RECENT · PAGE 1/1 · 9 TOTAL

TOOL · CL_109991 · Jun 25 · 04:00

New benchmark MINIF2F-DAFNY tests LLMs for mathematical theorem proving

Researchers have developed MINIF2F-DAFNY, a new benchmark for evaluating Large Language Models (LLMs) in mathematical theorem proving. This system translates the miniF2F benchmark to Dafny, an auto-active verifier, enab…

SIGNIFICANT · CL_106351 · Jun 21 · 04:58

NVIDIA Nemotron 3 Nano: Open Model for Efficient AI Agents

NVIDIA has released Nemotron 3 Nano, a 30-billion parameter open model designed for efficient reasoning and long-context applications. This model utilizes a hybrid Mixture-of-Experts architecture, activating only a frac…

SIGNIFICANT · CL_100955 · Jun 19 · 16:15

NVIDIA unveils efficient Nemotron 3 LLM family with hybrid architecture

NVIDIA has released two new large language models, Nemotron 3 Nano and Nemotron 3 Ultra, focusing on efficiency and advanced capabilities. Nemotron 3 Nano is a 30B-class model designed for private inference and agentic …

RESEARCH · CL_99600 · Jun 18 · 10:40

Lean Proof Assistant Enhances Reinforcement Learning for Theorem Proving

Researchers have developed a novel method for theorem proving using reinforcement learning, integrating the Lean proof assistant to provide detailed, verified feedback. This approach, termed Process-Verified Reinforceme…

TOOL · CL_93231 · Jun 16 · 04:00

New study tests AI proof formalization models for robustness

A new study on arXiv evaluates the robustness of proof autoformalization models, which translate natural language mathematical proofs into formal languages like Lean 4. Researchers introduced global and local perturbati…

TOOL · CL_74387 · Jun 6 · 04:00

LLMs evaluated for formal math proofs in Lean 4

A new research paper evaluates the performance of various Large Language Models (LLMs) in generating formal mathematical proofs using the Lean 4 theorem prover. The study employed pass@k and refine@k metrics on subsets …

TOOL · CL_70440 · Jun 4 · 04:00

LLM autoformalization struggles with paraphrased inputs

Researchers have investigated the robustness of large language models (LLMs) in autoformalization tasks, specifically their ability to generate formal proofs from natural language statements. The study found that LLMs e…

TOOL · CL_22214 · May 8 · 04:00

New AI method achieves 100% formal validity in theorem autoformalization

Researchers have developed a novel reference-free iterative refinement process for autoformalizing entire mathematical theorems. This method utilizes feedback from theorem provers and LLM-based judges to enhance formal …

RESEARCH · CL_06763 · Apr 28 · 04:00

Lean 4 autoformalization sensitive to surface phrasing, not semantics

Researchers have investigated the impact of natural language variations on Lean 4 autoformalization, finding that semantically equivalent paraphrases can lead to different formal outputs. Their study, using GPT-family m…