Baidu releases Unlimited OCR for advanced long-document parsing

By PulseAugur Editorial · [3 sources] · 2026-06-19 09:40

Baidu has released Unlimited OCR, a new document-parsing model. It uses a constant-size KV cache to reach reported state-of-the-art performance, particularly on long documents. The model is available on Hugging Face and works with the Transformers library and with inference engines such as vLLM and SGLang; deployment options include Docker.

IMPACT This release improves long-document parsing, which may benefit industries that process large volumes of text.

RANK_REASON Model release from a significant AI lab (Baidu) with a specific name and capability.

AI-generated summary · Google Gemini · from 3 sources.

COVERAGE [3]

Hugging Face Trending Models TIER_1 (ET) · baidu · 2026-06-19 09:40

baidu/Unlimited-OCR

image-text-to-text · 47 downloads · 55 likes

Pandaily TIER_1 English(EN) · Pandaily · 2026-06-23 08:15

Baidu Unveils Unlimited-OCR: Constant KV Cache Delivers SOTA Performance on Long Documents

Mastodon (fosstodon.org) TIER_1 English(EN) · 2026-06-23 10:34

Baidu has unveiled Unlimited-OCR, a new model that addresses a fundamental bottleneck in long-document transcription. By introducing Reference Sliding Window Attention, it reduces memory growth from linear to constant, and it scores 93.92 percent on the OmniDocBench benchmark. The 3B…
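The difference between linear and constant cache growth can be illustrated with a toy example. The sketch below is not Baidu's implementation: the sources do not describe how Reference Sliding Window Attention works, so the design here (a fixed block of "reference" entries plus a fixed-size window of recent entries) is an assumption made purely for illustration, and the names `BoundedKVCache`, `cache_sizes`, `num_reference`, and `window` are hypothetical.

```python
from collections import deque


class BoundedKVCache:
    """Toy cache: a fixed block of reference entries plus a sliding window.

    Hypothetical illustration only; not the Unlimited-OCR mechanism.
    """

    def __init__(self, reference, window):
        self.reference = list(reference)    # assumed to stay for the whole run
        self.recent = deque(maxlen=window)  # oldest entry is evicted automatically

    def append(self, entry):
        self.recent.append(entry)

    def __len__(self):
        return len(self.reference) + len(self.recent)


def cache_sizes(num_tokens, num_reference=4, window=8):
    """Return (full_cache_size, bounded_cache_size) after num_tokens steps."""
    full = []  # ordinary cache: one entry kept per generated token
    bounded = BoundedKVCache(range(num_reference), window)
    for token in range(num_tokens):
        full.append(token)
        bounded.append(token)
    return len(full), len(bounded)


if __name__ == "__main__":
    for n in (4, 64, 1024):
        print(n, cache_sizes(n))
```

In this toy setup the ordinary cache grows with every token, while the bounded cache stops growing once the window is full, which is the sense in which memory is "constant" with respect to document length.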

LINKS pandaily.com/baidu-unlimited-ocr-constant…
