VATS: Exploiting Implicit Authority in Error-Path Injection via Systematic Mutation
Researchers have developed a new framework called VATS to exploit vulnerabilities in how AI models handle tool errors. This method systematically mutates error messages to inject malicious instructions, bypassing standard safety measures. In tests with leading models like Gemini 3.1 Pro and GPT-5.5, this error-path injection technique significantly increased the success rate of prompt injection attacks, reaching up to 100% in some evaluations. While current production safeguards can offer some protection, the underlying susceptibility in the models themselves presents a risk to custom AI agent workflows.
AI IMPACT: New attack vector identified that could compromise AI agent security and reliability.