Baidu has open-sourced a new OCR model called Unlimited OCR, which excels at processing long documents by mimicking human reading habits. Unlike traditional OCR systems that process documents page by page and then stitch the results together, Unlimited OCR uses a novel Reference Sliding Window Attention (R-SWA) mechanism. This lets it maintain a continuous reading state without the memory and compute costs that typically grow with document length. The model sets a new state of the art on the OmniDocBench benchmark.
AI IMPACT Introduces a novel approach to long-context memory management that could carry over to sequence-based AI tasks beyond OCR.
RANK_REASON New OCR model release from a major tech company (Baidu) with a novel attention mechanism and benchmark performance claims.
- Baidu
- DeepSeek
- DeepSeek OCR
- GLM-OCR
- OmniDocBench
- PaddleOCR
- Reference Sliding Window Attention
- Unlimited OCR
AI-generated summary · Google Gemini · from 1 source.