Brief · PulseAugur

RESEARCH · arXiv cs.LG · 4d · [2 sources]

Structure-Guided Visual Perturbation Neutralization for LVLMs

Recent research targets the vulnerability of Large Vision-Language Models (LVLMs) to manipulated or misleading visual inputs. One approach, SIGN, is a lightweight defense framework that uses structural extraction and dynamic neutralization to suppress adversarial perturbations in image inputs; it reports a high defense success rate with minimal pixel modification and computational overhead. Another development, MVI-Bench, is a comprehensive benchmark that evaluates LVLM robustness against misleading visual inputs across different hierarchical levels and reveals significant vulnerabilities in current state-of-the-art models.

IMPACT New benchmarks and defense mechanisms are crucial for the safe and reliable deployment of LVLMs in real-world applications.

LVLMs
Huiyi Chen
MVI-Bench