Researchers have developed MerLean-Prover, an end-to-end theorem prover for Lean 4 that generates kernel-checkable proofs. The system runs a recursive loop with three agent types (Planning, Check, and Lean) and performs strongly on the FormalQualBench and Putnam2025 benchmarks. It scored 10/23 on FormalQualBench, outperforming existing open-source baselines, and solved all 12 Putnam2025 problems with reduced computation time. The harness design also proved effective with smaller models, including Sonnet and Haiku.
AI-generated summary · Google Gemini · from 2 sources.