Researchers have introduced REALM, a benchmark for evaluating the vulnerabilities of physical-world Vision-Language Models (VLMs). It unifies 12 red-teaming methods, 3 defenses, and 13 VLMs under a black-box threat model, with shared datasets and metrics for fair comparison. REALM uses an agentic target-generation pipeline to create scenario-specific, physically grounded attack objectives. Its results indicate that text and typographic injection attacks are the most effective, and that model scale alone does not guarantee adversarial robustness.
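To give a sense of the scale of such a comparison, here is a minimal sketch of an evaluation grid. It assumes a full cross product of attacks, defenses, and models, which the summary does not confirm, and it uses attack success rate as an illustrative metric, since the summary does not name REALM's actual metrics. All names (`evaluation_grid`, `attack_success_rate`, `attack_0`, and so on) are hypothetical placeholders, not part of REALM.

```python
from itertools import product


def attack_success_rate(outcomes):
    """Fraction of attack attempts judged successful (True).

    Assumed metric for illustration; not taken from the REALM paper.
    """
    outcomes = list(outcomes)
    if not outcomes:
        raise ValueError("no outcomes to score")
    return sum(outcomes) / len(outcomes)


def evaluation_grid(attacks, defenses, models):
    """List every (attack, defense, model) cell so all cells share one protocol."""
    return list(product(attacks, defenses, models))


# Placeholder names only; the counts match the summary (12, 3, 13).
attacks = [f"attack_{i}" for i in range(12)]
defenses = [f"defense_{i}" for i in range(3)]
models = [f"vlm_{i}" for i in range(13)]

grid = evaluation_grid(attacks, defenses, models)
print(len(grid))  # 12 * 3 * 13 = 468 cells under the full-grid assumption
print(attack_success_rate([True, False, True, True]))  # 0.75
```

Under a black-box threat model, each cell would be scored only from the model's outputs, without access to weights or gradients.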
IMPACT Establishes a standardized method for assessing the safety and robustness of VLMs in physical-world applications.
RANK_REASON The item is a research paper introducing a new benchmark for evaluating AI models.
- alphaXiv
- arXiv
- CatalyzeX
- Connected Papers
- DagsHub
- Gotit.pub
- Hugging Face
- Influence Flower
- Litmaps
- REALM
- ScienceCast
- scite Smart Citations
- Vision-Language Models
AI-generated summary · Google Gemini · from 1 source.