Brief · PulseAugur

RESEARCH · arXiv cs.AI English(EN) · 5d · [6 sources]

Proof-Refactor: Refactoring Generated Formal Proofs into Modular Artifacts

Researchers have introduced several frameworks that improve formal theorem proving with large language models. Goedel-Architect generates a proof blueprint and then refines it, achieving state-of-the-art performance on benchmarks such as MiniF2F-test and PutnamBench with the DeepSeek-V4-Flash model. Proof-Refactor restructures LLM-generated proofs to improve their modularity, readability, and maintainability, and outperforms existing baselines on PutnamBench. A third approach, Compile to Compress, uses compiler output to refine proof attempts efficiently, achieving top results on PutnamBench with smaller models.

IMPACT: These advances in AI-driven formal theorem proving could accelerate mathematical discovery and software verification.

Guchan Li
PutnamBench
arXiv
Claude Code
LLMs
Putnam2025
Proof-Refactor
DeepSeek-V4-Flash
Goedel-Architect
Compile to Compress
MiniF2F-test