PromptAudit: Auditing Prompt Sensitivity in LLM-Based Vulnerability Detection

New Research Explores LLM Vulnerability Detection, Improving Accuracy and Analyzing Prompt Sensitivity

By the PulseAugur editorial team · [2 sources] · 2026-05-26 04:00

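Two new research papers examine the use of large language models (LLMs) for software vulnerability detection. The first introduces VULPO, an on-policy optimization framework that uses a new dataset, ContextVul, to improve how well LLMs identify vulnerabilities by taking contextual information and reasoning trajectories into account. The specialized model VULPO-4B significantly outperforms existing methods. The second presents PromptAudit, a framework for evaluating how prompt sensitivity affects LLM-based vulnerability detection. It finds that chain-of-thought prompting is generally effective, but that changes to the prompt can substantially alter a model's performance and reliability.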
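To make the idea of prompt sensitivity concrete, the following is a minimal sketch of how one might measure it. It is not PromptAudit's actual method or API, which the abstract excerpt does not describe. The function `prompt_sensitivity`, the template names, the sample data, and the `toy_classify` stub (a stand-in for a real LLM call, with arbitrary behavior) are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Tuple

# (code snippet, ground-truth "is vulnerable" label); illustrative data only.
Sample = Tuple[str, bool]

# Hypothetical prompt templates; real studies would vary many more factors.
TEMPLATES: Dict[str, str] = {
    "zero_shot": "Is this code vulnerable? {code}",
    "chain_of_thought": "Think step by step, then say if this is vulnerable: {code}",
}

SAMPLES: List[Sample] = [
    ("strcpy(buf, src);", True),
    ("memcpy(dst, src, sizeof(dst));", False),
    ("return a + b;", False),
]


def toy_classify(template: str, code: str) -> bool:
    """Assumed stand-in for an LLM call; its rules are arbitrary."""
    prompt = template.format(code=code)
    risky = "strcpy" in code
    if "step by step" in prompt:
        return risky
    # The other template also flags memcpy, so verdicts can differ by prompt.
    return risky or "memcpy" in code


def prompt_sensitivity(
    classify: Callable[[str, str], bool],
    templates: Dict[str, str],
    samples: List[Sample],
) -> Dict[str, object]:
    """Compare one detector across prompt templates on the same samples."""
    accuracy: Dict[str, float] = {}
    for name, template in templates.items():
        hits = sum(classify(template, code) == label for code, label in samples)
        accuracy[name] = hits / len(samples)

    # A sample "flips" if different templates produce different verdicts.
    flips = 0
    for code, _ in samples:
        verdicts = {classify(t, code) for t in templates.values()}
        if len(verdicts) > 1:
            flips += 1

    scores = list(accuracy.values())
    return {
        "accuracy": accuracy,                 # per-template accuracy
        "spread": max(scores) - min(scores),  # best minus worst template
        "flip_rate": flips / len(samples),    # share of unstable verdicts
    }


if __name__ == "__main__":
    print(prompt_sensitivity(toy_classify, TEMPLATES, SAMPLES))
```

Holding the samples and the detector fixed while varying only the template is what lets the accuracy spread and flip rate be attributed to the prompt. A real evaluation would replace `toy_classify` with calls to an actual model and use a labeled vulnerability dataset.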

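Impact: These studies highlight progress in applying LLMs to code security and could lead to more robust automated vulnerability detection tools.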

Ranking rationale: Two academic papers published on arXiv that detail new methods and analysis for LLM-based vulnerability detection.


AI-generated summary · Google Gemini · based on 2 sources.

Sources [2]

arXiv cs.AI TIER_1 English(EN) · Youpeng Li, Fuxun Yu, Weiliang Qi, Xinda Wang · 2026-05-28 04:00

VULPO: Context-Aware Vulnerability Detection via On-Policy LLM Optimization

arXiv:2511.11896v3 Announce Type: replace-cross Abstract: Large language models (LLMs) have recently shown strong potential in vulnerability detection (VD). However, accurately detecting vulnerabilities in real-world repositories requires reasoning over complex contextual interac…
arXiv cs.AI TIER_1 English(EN) · Steffen J. Camarato, Yahya Hmaiti, Mandana Ghadamian, David Mohaisen · 2026-05-26 04:00

PromptAudit: Auditing Prompt Sensitivity in LLM-Based Vulnerability Detection

arXiv:2605.24171v1 Announce Type: cross Abstract: Large language models are increasingly used for vulnerability detection, yet their reliability under different prompt formulations remains uncharacterized. We present PromptAudit, a controlled evaluation framework that isolates pr…

