A new benchmark study evaluates ten OCR systems, including specialized OCR-VLMs and frontier multimodal LLMs, on Devanagari script. Many systems perform well on clean synthetic text, but their performance drops significantly on degraded inputs and real-world scans. Specialized OCR-VLMs proved particularly fragile, with DeepSeek-OCR exhibiting catastrophic repetition failures. Notably, strong performance on English OCR did not correlate with performance on Indic scripts, with models such as GPT-5.5 showing a substantial drop.
AI IMPACT Highlights the limitations of current multimodal models on non-English scripts, indicating a need for better multilingual capabilities and robustness.
RANK_REASON Academic paper presenting a new benchmark and study on OCR performance for a specific script.
- Claude Opus 4.7
- DeepSeek-OCR
- Devanagari
- EasyOCR
- Gemini 2.5 Flash
- GPT-5.5
- Mistral OCR
- OCR-VLMs
- olmOCR-7B
- Qwen2.5-VL-3B
- Qwen3-VL-8B
AI-generated summary · Google Gemini · from 1 source.