New attack method targets adversarial transferability in vision-language models

By PulseAugur Editorial · [1 source] · 2026-05-26 04:00

Researchers have proposed Grounding-Driven Attack (GDA), a method for improving the transferability of adversarial attacks against large vision-language models (LVLMs). Existing attacks often assume that the models involved share similar encoder architectures. GDA instead focuses on text-conditioned grounding regions, which are more stable across different LVLM architectures. The method allocates the perturbation budget to these grounded regions and intensifies their disruption, and the paper reports superior performance in black-box scenarios.
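To make the idea of a region-weighted perturbation budget concrete, here is a minimal sketch. It is not the paper's algorithm: the binary `mask`, the `ratio` multiplier, the rule that the average budget must equal `eps`, and both function names are assumptions introduced only for illustration.

```python
# Hypothetical illustration of region-weighted budget allocation.
# Nothing here is taken from the GDA paper; names and rules are assumed.

def allocate_budget(mask, eps, ratio=3.0):
    """Return a per-pixel L-infinity budget whose mean equals eps.

    mask  : 2-D list of 0/1 flags, 1 marking an (assumed) grounded pixel
    ratio : assumed budget multiplier for grounded pixels versus the rest
    """
    weights = [[1.0 + (ratio - 1.0) * m for m in row] for row in mask]
    total = sum(sum(row) for row in weights)
    count = sum(len(row) for row in weights)
    mean_w = total / count
    # Rescale so the average per-pixel budget stays at eps.
    return [[eps * w / mean_w for w in row] for row in weights]


def project(delta, budget):
    """Clip each perturbation value into [-budget, +budget] for its pixel."""
    return [[max(-b, min(b, d)) for d, b in zip(drow, brow)]
            for drow, brow in zip(delta, budget)]


# Toy 4x4 image with a 2x2 grounded region in the centre.
mask = [[0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0]]
eps = 8 / 255
budget = allocate_budget(mask, eps)

# An oversized candidate perturbation is clipped to each pixel's budget.
delta = [[0.1] * 4 for _ in range(4)]
clipped = project(delta, budget)
```

In this toy setup, grounded pixels may move three times as far as background pixels while the average budget stays at `eps`. A real attack would apply such a projection inside an iterative optimization loop against a surrogate encoder, which this sketch leaves out.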

IMPACT This research highlights a vulnerability in vision-language models and proposes a more effective attack strategy, potentially influencing future robustness evaluations and defense mechanisms.

RANK_REASON The cluster contains a research paper detailing a new method for adversarial attacks on large vision-language models.

paper
safety

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

arXiv cs.CV TIER_1 English(EN) · Xinwei Zhang, Li Bai, Tianwei Zhang, Youqian Zhang, Qingqing Ye, Yingnan Zhao, Ruochen Du, Haibo Hu · 2026-05-26 04:00

Grounding-Driven Attack: Improving Encoder-based Adversarial Transferability against Large Vision-Language Models

arXiv:2602.09431v2 · Abstract: Large vision-language models (LVLMs) have achieved impressive performance across multimodal tasks, but their reliance on visual inputs exposes them to adversarial threats. Encoder-based attacks provide an efficient alterna…
