LLMs' formalization accuracy improved with roundtrip verification and repair

By the PulseAugur editorial team · [2 sources] · 2026-04-27 22:26

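Researchers have developed a roundtrip verification method for assessing how faithfully a large language model formalizes natural language. The technique translates the formalized statement back into natural language, re-formalizes it, and then uses a formal tool to check whether the two formalizations are logically equivalent. When they differ, a diagnosis-and-repair process is applied, which raised formal equivalence from 45-61% to 83-85% for models such as Claude Opus 4.6 and GPT-5.2.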
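The loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the names `equivalent`, `verify_and_repair`, `toy_repair`, and the lookup tables are hypothetical, the "models" are plain dictionaries standing in for LLM calls, formulas are Python boolean expressions, and a brute-force truth table stands in for the formal tool, which the summary does not name. The toy tables are deliberately inconsistent so that the repair path is exercised.

```python
from itertools import product
from typing import Callable

def equivalent(f1: str, f2: str, variables: list[str]) -> bool:
    """Truth-table check that two propositional formulas always agree.

    Assumption: formulas are Python boolean expressions over `variables`.
    A real pipeline would call a solver or prover instead.
    """
    for values in product([False, True], repeat=len(variables)):
        env = dict(zip(variables, values))
        # Builtins are disabled; only the variable bindings are visible.
        if eval(f1, {"__builtins__": {}}, env) != eval(f2, {"__builtins__": {}}, env):
            return False
    return True

def verify_and_repair(
    text: str,
    formalize: Callable[[str], str],
    informalize: Callable[[str], str],
    repair: Callable[[str, str, str, str], str],
    variables: list[str],
    max_rounds: int = 2,
) -> tuple[str, bool]:
    """Return (formalization, passed_roundtrip) after at most `max_rounds` repairs."""
    candidate = formalize(text)
    for round_index in range(max_rounds + 1):
        back = informalize(candidate)   # formal -> natural language
        again = formalize(back)         # natural language -> formal, second pass
        if equivalent(candidate, again, variables):
            return candidate, True
        if round_index < max_rounds:
            # Hypothetical repair hook: sees the original text and the mismatch.
            candidate = repair(text, candidate, back, again)
    return candidate, False

# Toy stand-ins for model calls (hypothetical data, chosen to trigger a repair).
FORMALIZE = {
    "If it rains, the ground is wet.": "(not rain) or wet",
    "It rains or the ground is wet, but not both.": "rain or wet",  # lossy
    "Either it rains or the ground is wet.": "rain != wet",
    "Exactly one holds: it rains, or the ground is wet.": "rain != wet",
}
INFORMALIZE = {
    "(not rain) or wet": "If it rains, the ground is wet.",
    "rain or wet": "Either it rains or the ground is wet.",
    "rain != wet": "Exactly one holds: it rains, or the ground is wet.",
}

def toy_repair(text: str, first: str, back: str, second: str) -> str:
    # Stand-in for diagnosis: notice the dropped "but not both" clause.
    return "rain != wet" if "but not both" in text else first

VARIABLES = ["rain", "wet"]
result = verify_and_repair(
    "It rains or the ground is wet, but not both.",
    FORMALIZE.__getitem__, INFORMALIZE.__getitem__, toy_repair, VARIABLES,
)
print(result)  # ('rain != wet', True)
```

Note that the check needs no ground-truth formalization: it only compares the first and second formalizations with each other, so a pair of models that err consistently in both directions would still pass.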

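Impact: Introduces a method for improving the faithfulness of LLM formalization, which could make code generation and logical reasoning more reliable.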

Ranking rationale: Academic paper introducing a new verification method for LLM formalization.

Read on arXiv cs.CL →

AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

arXiv cs.CL TIER_1 English(EN) · Daneshvar Amrollahi, Jerry Lopez, Clark Barrett · 2026-04-29 04:00

Faithful Autoformalization via Roundtrip Verification and Repair

arXiv:2604.25031v1 Announce Type: new Abstract: When an LLM formalizes natural language, how do we know the output is faithful? We propose a roundtrip verification approach which does not require ground-truth annotations: formalize a statement, translate the result back to natura…

arXiv cs.CL TIER_1 English(EN) · Clark Barrett · 2026-04-27 22:26

Faithful Autoformalization via Roundtrip Verification and Repair

When an LLM formalizes natural language, how do we know the output is faithful? We propose a roundtrip verification approach which does not require ground-truth annotations: formalize a statement, translate the result back to natural language, re-formalize, and use a formal tool …

