MLLMs pose urgent threat to CAPTCHA security, new paper finds

By PulseAugur Editorial · [1 source] · 2026-06-15 04:00

A new research paper details how multimodal large language models (MLLMs) can effectively solve visual CAPTCHAs, posing a significant security risk. The study evaluated seven MLLMs across 18 CAPTCHA types and found that current models can solve many recognition-oriented and low-interaction CAPTCHAs at human-like cost and speed. The researchers propose defense strategies, including incorporating fine-grained localization and implicit counting, which reduced MLLM success rates from over 95% to 0% on a hardened CAPTCHA type. The paper emphasizes the urgent need to redesign CAPTCHAs as MLLM capabilities advance.



paper
safety

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

arXiv cs.AI TIER_1 English(EN) · Junyu Wang, Changjia Zhu, Yuanbo Zhou, Lingyao Li, Xu He, Mingkui Wei, Junjie Xiong · 2026-06-15 04:00

COGNITION: From Evaluation to Defense against Multimodal LLM CAPTCHA Solvers

arXiv:2512.02318v4 · Abstract: This paper studies how multimodal large language models (MLLMs) undermine the security guarantees of visual CAPTCHA. We identify the attack surface where an adversary can cheaply automate CAPTCHA solving using off-the-shel…
