New benchmark reveals multilingual safety gaps in vision-language models

By PulseAugur Editorial · [1 source] · 2026-06-09 04:00

Researchers have introduced MLingualFC, a multilingual benchmark for testing the safety vulnerabilities of vision-language models (VLMs). The benchmark embeds harmful instructions in flowchart images in five languages: Hindi, Punjabi, Spanish, Romanian, and German. Evaluations of models including Qwen2.5-VL, Gemma-4, and Pangea found that visual attacks are highly successful in Latin-script languages, indicating that current safety measures do not generalize well across languages and modalities.

IMPACT Highlights the need for more robust, multilingual safety alignment in advanced AI models.

RANK_REASON The cluster contains an academic paper introducing a new benchmark for evaluating AI model safety.

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

arXiv cs.AI TIER_1 English(EN) · Rishabh Makwana, Mamta, Deeksha Varshney, Oana Cocarascu · 2026-06-09 04:00

MLingualFC: Evaluating Jailbreak Vulnerabilities in Multilingual Vision-Language Models

arXiv:2606.07706v1. Abstract: Vision-Language Models (VLMs) have demonstrated strong performance across multimodal tasks, yet their safety robustness remains an open challenge. While prior work has shown that structured visual prompts such as flowcharts can ef…

