New FBHM benchmark reveals VLM weaknesses in hateful meme detection

By PulseAugur Editorial · [2 sources] · 2026-05-29 14:27

Researchers have introduced FBHM, a benchmark for evaluating how well vision-language models (VLMs) detect hateful memes. Existing benchmarks confound rhetorical hate mechanisms with features of the targeted community, which prevents causal analysis of where VLMs fail. FBHM comprises 5,000 memes spanning 25 functionalities and 10 target communities. Current state-of-the-art VLMs perform poorly on it, which suggests they rely on dataset-specific heuristics rather than robust multimodal reasoning. To address this, the researchers propose LSV, a low-data strategy that uses as few as 500 samples to significantly improve VLM performance on FBHM.
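
As a rough illustration of why crossing functionalities with target communities matters, the sketch below scores a model separately along each axis, so a weakness tied to a rhetorical mechanism can be told apart from one tied to a targeted group. The field names (`functionality`, `target`, `label`, `pred`) and the toy records are assumptions made for this example; they are not the FBHM schema or its results.

```python
from collections import defaultdict


def grouped_accuracy(records, key):
    """Return {group: accuracy} for records grouped by the given field."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for r in records:
        total[r[key]] += 1
        correct[r[key]] += int(r["label"] == r["pred"])
    return {group: correct[group] / total[group] for group in total}


# Toy records invented for illustration; not drawn from FBHM.
records = [
    {"functionality": "F1", "target": "T1", "label": 1, "pred": 1},
    {"functionality": "F1", "target": "T2", "label": 1, "pred": 0},
    {"functionality": "F2", "target": "T1", "label": 0, "pred": 0},
    {"functionality": "F2", "target": "T2", "label": 1, "pred": 1},
]

# Accuracy per rhetorical functionality, and separately per target community.
by_functionality = grouped_accuracy(records, "functionality")
by_target = grouped_accuracy(records, "target")
```

A breakdown like this only separates the two factors when each functionality appears across several target communities, which is the kind of crossed design the summary attributes to FBHM.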

IMPACT This benchmark could drive the development of more robust multimodal reasoning in VLMs, improving safety by making hateful content detection more reliable.

RANK_REASON The cluster contains a research paper introducing a new benchmark and methodology for evaluating AI models.

paper
safety

AI-generated summary · Google Gemini · from 2 sources.

COVERAGE [2]

arXiv cs.AI TIER_1 English(EN) · Paramananda Bhaskar, Naquee Rizwan, Daksh Jogchand, Saurabh Kumar Pandey, Animesh Mukherjee · 2026-06-01 04:00

FBHM: Functional Benchmarking and Steering of VLMs for Hateful Meme Detection

arXiv:2605.31349v1 · Announce Type: cross · Abstract: Hateful meme detection remains a formidable challenge for vision-language models, as existing benchmarks are structurally observational - confounding rhetorical hate mechanisms with target community features and preventing causal …

arXiv cs.AI TIER_1 English(EN) · Animesh Mukherjee · 2026-05-29 14:27

FBHM: Functional Benchmarking and Steering of VLMs for Hateful Meme Detection

Hateful meme detection remains a formidable challenge for vision-language models, as existing benchmarks are structurally observational - confounding rhetorical hate mechanisms with target community features and preventing causal evaluation of model vulnerabilities. To address th…
