Lean-GAP: A Dataset of Formalized Graduate Algebra Problems
Researchers have developed Lean-GAP, a dataset of 430 formalized graduate-level algebra problems drawn from the textbook "Abstract Algebra" by Dummit and Foote. The problems were produced by a pipeline that preprocesses the PDF into LaTeX and then autoformalizes the result into Lean 4, with human oversight remaining crucial for verification. The work contributes a structured dataset, a methodology for formalizing mathematical texts, and an analysis of the challenges in translating informal statements into formal language, including a comparison of autoformalization models.
AI IMPACT: Formalizing complex mathematical texts could enable more robust AI reasoning and verification in advanced academic domains.
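
To give a sense of what translating an informal statement into Lean 4 involves, here is a minimal sketch of how a standard introductory group theory exercise ("if every element of a group squares to the identity, the group is abelian") might be stated. It is an illustration only and is not taken from Lean-GAP. The theorem name is hypothetical, the sketch assumes Mathlib is available, and the proof is left as `sorry` because the point is the statement, not the proof.

```lean
import Mathlib

-- Informal statement: if x * x = 1 for every x in a group G, then G is abelian.
-- Hypothetical name; statement only, with the proof deliberately omitted.
theorem comm_of_forall_mul_self_eq_one {G : Type*} [Group G]
    (h : ∀ x : G, x * x = 1) (a b : G) :
    a * b = b * a := by
  sorry
```

Even in a case this small, the formalizer has to make choices the informal text leaves implicit, such as how the group is represented and how "for all x" is quantified, which is where human verification of autoformalized output becomes important.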