Evaluating the Robustness of Proof Autoformalization in Lean 4
A new study on arXiv evaluates the robustness of proof autoformalization models, which translate natural-language mathematical proofs into formal languages such as Lean 4. The researchers applied global and local perturbations to informal proofs to test model consistency and faithfulness. The evaluation found that seven recent models were sensitive to global paraphrasing and largely failed to reflect local changes in symbols or proof steps accurately.
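
To make the idea of a local perturbation concrete, here is a minimal Lean 4 sketch. It is an illustration only and is not taken from the study: the informal statements and the theorem names `original` and `perturbed` are hypothetical, chosen to show how a one-symbol change in the informal text should produce a different formal statement.

```lean
-- Hypothetical informal statement: "For every natural number n, n + 0 = n."
theorem original (n : Nat) : n + 0 = n :=
  Nat.add_zero n

-- Hypothetical local perturbation: the operands are swapped in the informal text,
-- "For every natural number n, 0 + n = n."
-- A faithful formalization has to change the statement accordingly.
theorem perturbed (n : Nat) : 0 + n = n :=
  Nat.zero_add n
```

Under this reading, a model that returned `original` for both informal inputs would still produce code that type-checks, yet it would not reflect the local change, which is the kind of failure the summary describes.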