PulseAugur
From Natural Language to Verified Code: Toward AI Assisted Problem-to-Code Generation with Dafny-Based Formal Verification

AI models achieve high verification success rates in formal code generation

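Researchers have developed a new dataset, NL2VC-60, containing 60 algorithmic problems and designed to support generating verified code from natural language. They evaluated seven open-source large language models (LLMs) under several prompting strategies, including self-repair prompting that feeds Dafny verifier feedback back to the model. This approach substantially improved performance: Gemma 4-31B reached a verification success rate of 90.91%, while GPT-OSS 120B reached 81.82% with guided feedback.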
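
The self-repair prompting described above can be pictured as a short loop: generate a candidate, run the verifier, and on failure put the verifier's message into the next prompt. The sketch below is only an illustration under stated assumptions, not the paper's implementation. The names `self_repair`, `generate`, `verify`, and `max_rounds`, the prompt wording, and the stub model and verifier are all hypothetical; in a real setup `generate` would call an LLM and `verify` would run the Dafny verifier.

```python
from typing import Callable, Optional, Tuple

# Hypothetical interfaces (assumptions, not taken from the paper):
#   generate(prompt) -> candidate Dafny source text (an LLM call in practice)
#   verify(source)   -> (ok, feedback); feedback is the verifier's error text
Generate = Callable[[str], str]
Verify = Callable[[str], Tuple[bool, str]]


def self_repair(problem: str, generate: Generate, verify: Verify,
                max_rounds: int = 3) -> Tuple[Optional[str], int]:
    """Return (verified_source, attempts_used), or (None, max_rounds) on failure."""
    prompt = f"Write a verified Dafny program for this problem:\n{problem}"
    for attempt in range(1, max_rounds + 1):
        candidate = generate(prompt)
        ok, feedback = verify(candidate)
        if ok:
            return candidate, attempt
        # Feed the verifier's complaint back so the next attempt can address it.
        prompt = (
            f"Problem:\n{problem}\n\n"
            f"Previous attempt:\n{candidate}\n\n"
            f"Verifier feedback:\n{feedback}\n\n"
            "Revise the program so that it verifies."
        )
    return None, max_rounds


def stub_generate(prompt: str) -> str:
    # Toy stand-in for an LLM: fixes the method body only after seeing feedback.
    if "Verifier feedback" in prompt:
        return "method Id(x: int) returns (y: int) ensures y == x { y := x; }"
    return "method Id(x: int) returns (y: int) ensures y == x { y := x + 1; }"


def stub_verify(source: str) -> Tuple[bool, str]:
    # Toy stand-in for a verifier run: rejects the known-wrong body.
    if "y := x + 1" in source:
        return False, "postcondition might not hold (toy message)"
    return True, ""


result, attempts = self_repair("Return the input unchanged.",
                               stub_generate, stub_verify)
```

With these stubs the first candidate is rejected and the second is accepted, so the loop stops after two attempts. Capping the number of rounds keeps the cost bounded when a model never converges on a verifiable program.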

Impact: Improves the reliability of LLM-generated code and could accelerate the development of high-assurance software.

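Ranking rationale: This cluster describes an academic paper that introduces a new dataset and evaluation method for AI-assisted code generation and formal verification.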


AI-generated summary · Google Gemini · from 2 sources.


Sources [2]

  1. arXiv cs.AI TIER_1 · Md Erfan, Md Kamal Hossain Chowdhury, Ahmed Ryan, Md Rayhanur Rahman

    From Natural Language to Verified Code: Toward AI Assisted Problem-to-Code Generation with Dafny-Based Formal Verification

    arXiv:2604.22601v1 Announce Type: cross Abstract: Large Language Models (LLMs) show promise in automated software engineering, yet their guarantee of correctness is frequently undermined by erroneous or hallucinated code. To enforce model honesty, formal verification requires LLM…

  2. arXiv cs.AI TIER_1 · Md Rayhanur Rahman

    From Natural Language to Verified Code: Toward AI Assisted Problem-to-Code Generation with Dafny-Based Formal Verification

    Large Language Models (LLMs) show promise in automated software engineering, yet their guarantee of correctness is frequently undermined by erroneous or hallucinated code. To enforce model honesty, formal verification requires LLMs to synthesize implementation logic alongside for…