EVA: Evolving Semantic Adversaries for Red-Teaming GUI Agents Against Environmental Injection Attacks
Researchers have developed EVA, an evolutionary framework for identifying semantic vulnerabilities in GUI agents powered by multimodal large language models (MLLMs). The method targets an agent's semantic understanding rather than its visual perception, achieving attack success rates of up to 85%. EVA rapidly evolves adversarial payloads within the model's latent space, highlighting a paradox in which alignment training can make agents more susceptible to deceptive semantic cues.
AI IMPACT: Reveals a critical alignment paradox in which agents trained for instruction-following are vulnerable to semantic deception, with potential implications for future AI safety research.