PulseAugur

LLMs' formalization accuracy improved with roundtrip verification and repair

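Researchers have developed a roundtrip verification method for assessing how faithfully large language models formalize natural language. The technique translates a formalized statement back into natural language, re-formalizes the result, and then uses a formal tool to check whether the two formalizations are logically equivalent. When they differ, a diagnosis-and-repair procedure is applied, which raised formal equivalence for models such as Claude Opus 4.6 and GPT-5.2 from 45-61% to 83-85%.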
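The roundtrip check described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names (`roundtrip_check`, `equivalent`), the lookup tables standing in for LLM calls, and the brute-force truth-table comparison standing in for the formal tool. The diagnosis-and-repair step is left out because the summary does not describe how it works.

```python
from itertools import product
from typing import Callable, NamedTuple


class RoundtripResult(NamedTuple):
    first: str        # formalization of the original statement
    second: str       # formalization of the back-translation
    consistent: bool  # True if the two formalizations are equivalent


def equivalent(f1: str, f2: str, variables: list[str]) -> bool:
    """Toy equivalence check: compare truth tables of two boolean expressions."""
    for values in product([False, True], repeat=len(variables)):
        env = dict(zip(variables, values))
        # eval is tolerable only because these are trusted toy strings.
        if eval(f1, {"__builtins__": {}}, env) != eval(f2, {"__builtins__": {}}, env):
            return False
    return True


def roundtrip_check(
    statement: str,
    formalize: Callable[[str], str],    # hypothetical: natural language -> formula
    informalize: Callable[[str], str],  # hypothetical: formula -> natural language
    variables: list[str],
) -> RoundtripResult:
    first = formalize(statement)         # step 1: formalize
    back = informalize(first)            # step 2: translate back
    second = formalize(back)             # step 3: re-formalize
    return RoundtripResult(first, second, equivalent(first, second, variables))


# Lookup tables standing in for model calls (hypothetical data).
TO_FORMAL = {
    "the ground is wet whenever it rains": "not rain or wet",
    "if it rains then the ground is wet": "wet or not rain",
    "it rains or the ground is wet": "rain or wet",
    "rain and wet ground, at least one": "rain and wet",  # deliberately wrong
}
TO_TEXT = {
    "not rain or wet": "if it rains then the ground is wet",
    "rain or wet": "rain and wet ground, at least one",
}

good = roundtrip_check("the ground is wet whenever it rains",
                       TO_FORMAL.__getitem__, TO_TEXT.__getitem__, ["rain", "wet"])
bad = roundtrip_check("it rains or the ground is wet",
                      TO_FORMAL.__getitem__, TO_TEXT.__getitem__, ["rain", "wet"])
```

In this toy run, `good.consistent` is true because the two formalizations differ only in operand order, while `bad.consistent` is false and would be the kind of discrepancy handed to a repair step. Note that no ground-truth formalization appears anywhere in the check, which matches the annotation-free framing of the abstract.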

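Impact: Introduces a method for improving the faithfulness of LLMs on formalization tasks, which could make code generation and logical reasoning more reliable.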

Ranking rationale: Academic paper introducing a new verification method for LLM formalization.

Read on arXiv cs.CL →

AI-generated summary · Google Gemini · from 2 sources.


Sources [2]

  1. arXiv cs.CL TIER_1 English(EN) · Daneshvar Amrollahi, Jerry Lopez, Clark Barrett ·

    Faithful Autoformalization via Roundtrip Verification and Repair

    arXiv:2604.25031v1. Abstract: When an LLM formalizes natural language, how do we know the output is faithful? We propose a roundtrip verification approach which does not require ground-truth annotations: formalize a statement, translate the result back to natura…

  2. arXiv cs.CL TIER_1 English(EN) · Clark Barrett ·

    Faithful Autoformalization via Roundtrip Verification and Repair

    When an LLM formalizes natural language, how do we know the output is faithful? We propose a roundtrip verification approach which does not require ground-truth annotations: formalize a statement, translate the result back to natural language, re-formalize, and use a formal tool …