Researchers have investigated how natural language variation affects Lean 4 autoformalization, finding that semantically equivalent paraphrases of the same statement can lead to different formal outputs. Their study, which used GPT-family models and open-weight autoformalizers on the ProofNet# and miniF2F datasets, found that this sensitivity stems primarily from compilation failures rather than from semantic disagreements between the outputs. The findings suggest that future efforts should focus on making generated Lean 4 code compile reliably rather than on the semantic layer of these systems.
Summary written by gemini-2.5-flash-lite from 1 source.
IMPACT Suggests that training for autoformalization tools should target compilation success rather than the semantic layer.
RANK_REASON Academic paper on autoformalization in Lean 4.
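The distinction the summary draws between compilation failures and semantic disagreements can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the `Formalization` record, the `classify_pair` and `tally` helpers, the category names, and the use of string equality on a normalized statement as a stand-in for a real semantic-equivalence check.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple


@dataclass
class Formalization:
    """Hypothetical record for one autoformalizer output."""
    compiles: bool                   # assumed flag: did the Lean 4 output type-check?
    canonical: Optional[str] = None  # assumed normalized statement, None if it did not compile


def classify_pair(a: Formalization, b: Formalization) -> str:
    """Classify the outputs produced for two paraphrases of one statement."""
    if not a.compiles and not b.compiles:
        return "both_fail"
    if a.compiles != b.compiles:
        # Exactly one paraphrase yields compiling code: a compilation-driven mismatch.
        return "compilation_mismatch"
    # Both compile: compare meanings (string equality is only a stand-in).
    return "agree" if a.canonical == b.canonical else "semantic_disagreement"


def tally(pairs: Iterable[Tuple[Formalization, Formalization]]) -> Dict[str, int]:
    """Count how many paraphrase pairs fall into each category."""
    counts = {
        "agree": 0,
        "compilation_mismatch": 0,
        "semantic_disagreement": 0,
        "both_fail": 0,
    }
    for a, b in pairs:
        counts[classify_pair(a, b)] += 1
    return counts
```

Under these assumptions, a result like the one summarized above would show up as `compilation_mismatch` dominating `semantic_disagreement` in the tally.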