ANVIL uses LLMs for anomaly-based vulnerability detection

By PulseAugur Editorial · [1 source] · 2026-06-30 04:00

Researchers have developed ANVIL, an approach that reframes software vulnerability identification as anomaly detection. Whereas supervised detectors are held back by limited labelled training data, ANVIL draws on Large Language Models (LLMs) trained on vast unlabelled code corpora. The system has an LLM reconstruct masked code segments and scores deviations between the reconstruction and the original as anomalies. On the PrimeVul dataset, it outperformed existing supervised detectors, achieving up to twice the Top-3 accuracy and a significantly higher ROC-AUC. Integrated with fuzzers, ANVIL also uncovered two previously unknown vulnerabilities, which points to practical use in software security.
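The mask-and-reconstruct idea in the summary can be illustrated with a minimal Python sketch. It is not the paper's implementation: the line-level masking, the `MASK` token, the `difflib`-based dissimilarity score, and the `toy_reconstruct` stand-in for an LLM are all assumptions made here for illustration.

```python
from difflib import SequenceMatcher
from typing import Callable, List

MASK = "<MASK>"  # hypothetical placeholder token, not taken from the paper


def anomaly_scores(
    lines: List[str],
    reconstruct: Callable[[List[str], int], str],
) -> List[float]:
    """Mask each line in turn, ask the model to fill it in, and score
    how far the reconstruction deviates from the original line."""
    scores = []
    for i, original in enumerate(lines):
        masked = lines[:i] + [MASK] + lines[i + 1:]
        guess = reconstruct(masked, i)
        similarity = SequenceMatcher(None, original.strip(), guess.strip()).ratio()
        scores.append(1.0 - similarity)  # 0.0 means a perfect reconstruction
    return scores


def top_k(scores: List[float], k: int = 3) -> List[int]:
    """Return the indices of the k most anomalous lines."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


def toy_reconstruct(masked: List[str], idx: int) -> str:
    """Stand-in for an LLM call (assumption): returns what a model that
    has seen many guarded copies might 'expect' at each position."""
    expected = {
        0: "void copy(char *dst, const char *src, int n) {",
        1: "    if (n > MAX) return;",
        2: "    memcpy(dst, src, n);",
        3: "}",
    }
    return expected[idx]


code = [
    "void copy(char *dst, const char *src, int n) {",
    "    int unused = 0;",  # the model expected a bounds check here
    "    memcpy(dst, src, n);",
    "}",
]

scores = anomaly_scores(code, toy_reconstruct)
ranked = top_k(scores, k=3)
```

In this toy run, only the line where the stub's expectation differs from the actual code receives a non-zero score, so it is ranked first. A real setup would replace `toy_reconstruct` with a call to a code LLM and would likely use a different deviation measure.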

IMPACT This research could lead to more effective and efficient methods for detecting software vulnerabilities, improving overall code security.

RANK_REASON This is a research paper detailing a new method for vulnerability identification using LLMs.

Read on arXiv cs.LG →

paper
safety

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

arXiv cs.LG TIER_1 English(EN) · Weizhou Wang, Eric Liu, Xiangyu Guo, Xiao Hu, Ilya Grishchenko, David Lie · 2026-06-30 04:00

ANVIL: Anomaly-based Vulnerability Identification without Labelled Training Data

arXiv:2408.16028v4 Announce Type: replace-cross Abstract: Supervised-learning-based vulnerability detectors often fall short due to limited labelled training data. In contrast, Large Language Models (LLMs) are trained on vast unlabelled code corpora, yet perform only marginally b…
