Brief · PulseAugur

TOOL · arXiv cs.AI English(EN) · 12h

PromptAudit: Auditing Prompt Sensitivity in LLM-Based Vulnerability Detection

Researchers have developed PromptAudit, a framework for assessing how prompt variations affect Large Language Models (LLMs) used for vulnerability detection. Their study tested five prompting strategies on five open-weight models, using 1,000 CVEs across 16 programming languages, and found that standard chain-of-thought prompting yielded the best results. The findings indicate that prompt sensitivity is a critical factor in LLM performance for vulnerability detection and should be accounted for during evaluation and deployment.

IMPACT Highlights how much prompt design affects the reliability and accuracy of LLMs in security applications.

LLM
chain-of-thought prompting
PromptAudit
few-shot prompting