Semantic Router: On the Feasibility of Hijacking MLLMs via a Single Adversarial Perturbation
Researchers have developed an attack method called Semantic-Aware Hijacking that can compromise Multimodal Large Language Models (MLLMs) with a single adversarial perturbation. The perturbation, termed a Semantic-Aware Universal Perturbation (SAUP), functions as a semantic router, directing inputs to attacker-defined targets. In experiments on models such as Qwen, a single perturbation hijacked five distinct targets with a 66% success rate.
AI IMPACT: This research highlights a significant vulnerability in MLLMs that could affect their deployment in safety-critical applications such as autonomous driving and robotics.
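To make the "single perturbation, several targets" idea concrete, here is a minimal toy sketch in NumPy. It is not the paper's method: the two-layer random network stands in for an MLLM, and the names `route`, `group_ids`, and `train_universal_perturbation`, the L-infinity budget `eps`, and the sign-gradient update are all illustrative assumptions. It also assumes that "routing" means the target depends on the semantic group of the input, which the summary suggests but does not spell out. One shared `delta` is optimized so that inputs from each group are pushed toward that group's attacker-chosen label.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def forward(X, delta, W1, W2):
    pre = (X + delta) @ W1          # the same delta is added to every input
    hidden = np.maximum(pre, 0.0)   # ReLU makes the effect of delta input-dependent
    return pre, hidden @ W2

def train_universal_perturbation(W1, W2, X, targets, eps=0.5, lr=0.02, steps=300):
    """Projected sign-gradient descent on ONE perturbation shared by all inputs."""
    delta = np.zeros(X.shape[1])
    onehot = np.eye(W2.shape[1])[targets]
    for _ in range(steps):
        pre, logits = forward(X, delta, W1, W2)
        d_logits = softmax(logits) - onehot       # gradient of cross-entropy w.r.t. logits
        d_pre = (d_logits @ W2.T) * (pre > 0)     # backpropagate through ReLU
        grad = (d_pre @ W1.T).mean(axis=0)        # average gradient w.r.t. delta
        delta = np.clip(delta - lr * np.sign(grad), -eps, eps)  # project onto L-inf ball
    return delta

# Toy data: five "semantic groups", each a noisy cluster (purely hypothetical).
rng = np.random.default_rng(0)
dim, hidden_dim, n_classes, n_groups = 32, 64, 5, 5
W1 = rng.normal(size=(dim, hidden_dim)) / np.sqrt(dim)
W2 = rng.normal(size=(hidden_dim, n_classes)) / np.sqrt(hidden_dim)
group_ids = rng.integers(0, n_groups, size=200)
centers = rng.normal(size=(n_groups, dim))
X = centers[group_ids] + 0.1 * rng.normal(size=(200, dim))

route = np.array([1, 2, 3, 4, 0])   # hypothetical group -> attacker target mapping
targets = route[group_ids]

delta = train_universal_perturbation(W1, W2, X, targets)
_, logits = forward(X, delta, W1, W2)
success_rate = float((logits.argmax(axis=1) == targets).mean())
print(f"fraction of inputs routed to their target: {success_rate:.2f}")
```

The printed fraction only describes this toy setup and says nothing about the 66% figure reported for real models. The design point it illustrates is that the optimization variable is a single vector shared across all inputs, while the loss uses a different target per semantic group.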