PulseAugur

New benchmark reveals Vision-Language Models struggle with script consistency

A new benchmark, PuMVR, evaluates how well Vision-Language Models (VLMs) handle multiple scripts within a single language. It comprises 1,000 parallel image-text instances across Punjabi's Gurmukhi, Shahmukhi, and Roman scripts, and reveals a significant 'Script Gap' in 10 state-of-the-art VLMs: models often perform well in one script but fail in others, with accuracy differences of up to 16%. The research proposes the Script Consistency Rate (SCR) as a metric for evaluating script-agnostic VLM performance and ensuring equitable AI access.
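The summary does not say how the Script Gap or SCR are computed, so the following is only a rough sketch under stated assumptions: the gap is taken as the difference between a model's best and worst per-script accuracy, and SCR as the share of parallel instances answered correctly in all three scripts. The function names, the data layout, and the toy results are hypothetical and are not taken from the paper.

```python
from typing import Dict, List

# The three Punjabi scripts named in the summary.
SCRIPTS = ("gurmukhi", "shahmukhi", "roman")

# One entry per parallel instance: script name -> was the model's answer correct?
Result = Dict[str, bool]


def per_script_accuracy(results: List[Result]) -> Dict[str, float]:
    """Accuracy of one model on each script, over the same parallel instances."""
    n = len(results)
    return {s: sum(r[s] for r in results) / n for s in SCRIPTS}


def script_gap(results: List[Result]) -> float:
    """ASSUMED definition: best per-script accuracy minus worst."""
    acc = per_script_accuracy(results)
    return max(acc.values()) - min(acc.values())


def script_consistency_rate(results: List[Result]) -> float:
    """ASSUMED definition: fraction of instances correct in every script."""
    consistent = sum(all(r[s] for s in SCRIPTS) for r in results)
    return consistent / len(results)


# Made-up toy results for four parallel instances (not benchmark data).
toy_results: List[Result] = [
    {"gurmukhi": True, "shahmukhi": True, "roman": True},
    {"gurmukhi": True, "shahmukhi": False, "roman": True},
    {"gurmukhi": True, "shahmukhi": False, "roman": False},
    {"gurmukhi": False, "shahmukhi": False, "roman": False},
]

if __name__ == "__main__":
    print(per_script_accuracy(toy_results))
    print(script_gap(toy_results))
    print(script_consistency_rate(toy_results))
```

Under these assumptions a model can post a respectable accuracy in one script while its consistency rate stays low, which is the kind of disparity the summary describes. The paper's actual definitions may differ.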

IMPACT Highlights a critical limitation in current multilingual VLMs, potentially driving development of more script-agnostic AI systems.

RANK_REASON The cluster contains an academic paper introducing a new benchmark and evaluation methodology for AI models.

AI-generated summary · Google Gemini · from 2 sources.

COVERAGE [2]

  1. arXiv cs.CL TIER_1 English(EN) · Prabhjot Singh, Bhushan Pawar, Madhu Reddiboina, Rajvee Sheth ·

    Not Truly Multilingual: Script Consistency as a Missing Dimension in VLM Evaluation

    arXiv:2606.17188v1 (announce type: cross). Abstract: Current multilingual evaluations for Vision-Language Models (VLMs) assume a one-to-one mapping between language and orthography, overlooking billions of users of multi-script languages. We introduce PuMVR (Punjabi Multimodal Visua…

  2. arXiv cs.CL TIER_1 English(EN) · Rajvee Sheth ·

    Not Truly Multilingual: Script Consistency as a Missing Dimension in VLM Evaluation

    Current multilingual evaluations for Vision-Language Models (VLMs) assume a one-to-one mapping between language and orthography, overlooking billions of users of multi-script languages. We introduce PuMVR (Punjabi Multimodal Visual Reasoning), a benchmark of 1,000 strictly parall…