PromptAudit: Auditing Prompt Sensitivity in LLM-Based Vulnerability Detection
Researchers have developed PromptAudit, a framework to assess how prompt variations affect Large Language Models (LLMs) used for vulnerability detection. Their study, which tested five prompting strategies on five open-weight models using 1,000 CVEs across 16 programming languages, revealed that standard chain-of-thought prompting yielded the best results. The findings indicate that prompt sensitivity is a critical factor in LLM performance for vulnerability detection and should be a key consideration during evaluation and deployment.
IMPACT: Highlights the critical role of prompt engineering in ensuring the reliability and accuracy of LLMs for security applications.
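
To make the idea of auditing prompt sensitivity concrete, here is a minimal sketch of the general technique: run the same labeled samples through several prompt templates and compare per-strategy accuracy. This is not PromptAudit's actual code or API. The template wording, the function names `audit` and `sensitivity`, the `toy_model` stand-in, and the three samples are all hypothetical. The toy model is deliberately rigged so that the two prompts disagree, so its numbers illustrate the measurement and are not evidence about any real model.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical prompt templates (assumption: the summary does not list
# the five strategies the study actually used).
PROMPTS: Dict[str, str] = {
    "zero_shot": "Is the following code vulnerable? Answer yes or no.\n{code}",
    "chain_of_thought": (
        "Think step by step, then answer yes or no: "
        "is this code vulnerable?\n{code}"
    ),
}


def audit(
    model: Callable[[str], str],
    samples: List[Tuple[str, bool]],
) -> Dict[str, float]:
    """Return accuracy per prompt strategy on the same labeled samples."""
    scores: Dict[str, float] = {}
    for name, template in PROMPTS.items():
        correct = 0
        for code, is_vulnerable in samples:
            answer = model(template.format(code=code)).strip().lower()
            predicted = answer.startswith("yes")
            correct += int(predicted == is_vulnerable)
        scores[name] = correct / len(samples)
    return scores


def sensitivity(scores: Dict[str, float]) -> float:
    """Spread between the best and worst strategy (one simple sensitivity measure)."""
    return max(scores.values()) - min(scores.values())


def toy_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call; rigged to be prompt-sensitive."""
    if "strcpy(" in prompt:
        return "yes"
    # Only flags gets() when the prompt asks for step-by-step reasoning.
    if "gets(" in prompt and "step by step" in prompt:
        return "yes"
    return "no"


# Hypothetical labeled samples: (code snippet, is_vulnerable).
SAMPLES: List[Tuple[str, bool]] = [
    ("strcpy(buf, src);", True),
    ("gets(buf);", True),
    ('puts("hi");', False),
]

if __name__ == "__main__":
    scores = audit(toy_model, SAMPLES)
    for name, accuracy in scores.items():
        print(f"{name}: {accuracy:.2f}")
    print(f"spread: {sensitivity(scores):.2f}")
```

In a real evaluation, `toy_model` would be replaced by a call to an actual model, and the spread would be reported per model alongside the per-strategy scores, since a large spread means a single-prompt benchmark result may not generalize.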