Grounding-Driven Attack: Improving Encoder-based Adversarial Transferability against Large Vision-Language Models
Researchers have developed a method called Grounding-Driven Attack (GDA) to improve the transferability of adversarial attacks against large vision-language models (LVLMs). Existing attacks often assume that target models share similar encoder architectures; GDA instead focuses on text-conditioned grounding regions, which are reported to be more stable across different LVLM architectures. The method allocates more of the perturbation budget to these grounded regions and intensifies their disruption, and it is reported to outperform prior attacks in black-box scenarios.
IMPACT This research highlights a vulnerability in vision-language models and proposes a more effective attack strategy, potentially influencing future robustness evaluations and defense mechanisms.
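
To make the idea of allocating perturbation budget to grounded regions concrete, here is a minimal sketch, not the paper's implementation. It assumes a binary grounding mask is already available and that the budget is an L-infinity bound that differs inside and outside the mask. The function names (`region_budget`, `perturb`) and the values `8/255` and `2/255` are hypothetical and chosen only for illustration. The summary does not say how GDA computes grounding regions or splits the budget.

```python
import numpy as np


def region_budget(mask: np.ndarray, eps_in: float, eps_out: float) -> np.ndarray:
    """Build a per-pixel L-infinity budget from a binary grounding mask.

    mask has shape (H, W); the result has shape (H, W, 1) so that it
    broadcasts over the channel axis of an (H, W, C) image.
    Assumption: a hard two-level budget, which the source does not specify.
    """
    budget = np.where(mask.astype(bool), eps_in, eps_out)
    return budget[..., None]


def perturb(image: np.ndarray, delta: np.ndarray, budget: np.ndarray) -> np.ndarray:
    """Clip delta to the per-pixel budget, add it, and keep pixels in [0, 1]."""
    bounded = np.clip(delta, -budget, budget)
    return np.clip(image + bounded, 0.0, 1.0)


# Toy example: a 4x4 image whose central 2x2 block is the "grounded" region.
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1
budget = region_budget(mask, eps_in=8 / 255, eps_out=2 / 255)

image = np.full((4, 4, 3), 0.5)
delta = np.ones((4, 4, 3))  # deliberately oversized raw perturbation
adv = perturb(image, delta, budget)
```

In this toy setup, pixels inside the mask can move by up to `8/255` while pixels outside are limited to `2/255`, so the perturbation concentrates where the mask says the text-relevant content is. A real attack would obtain `delta` from gradient steps on a surrogate model rather than a constant array.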