PulseAugur

Baidu releases Unlimited-OCR for advanced long-document parsing

Baidu has released Unlimited-OCR, a new model for document parsing. The model uses a constant-size KV cache to reach state-of-the-art performance, particularly on long documents. It is available on Hugging Face, works with the Transformers library and with inference engines such as vLLM and SGLang, and can also be deployed with Docker.

IMPACT This release offers improved long-document parsing capabilities, potentially benefiting industries dealing with extensive textual data.

RANK_REASON Model release from a significant AI lab (Baidu) with a specific name and capability.

AI-generated summary · Google Gemini · from 3 sources.

COVERAGE [3]

  1. Hugging Face Trending Models TIER_1 (ET) · baidu ·

    baidu/Unlimited-OCR

    image-text-to-text · 47 downloads · 55 likes

  2. Pandaily TIER_1 English(EN) · Pandaily ·

    Baidu Unveils Unlimited-OCR: Constant KV Cache Delivers SOTA Performance on Long Documents

  3. Mastodon (fosstodon.org) TIER_1 English(EN) ·

    Baidu has unveiled Unlimited-OCR, a new model that solves a fundamental bottleneck in long-document transcription. By introducing Reference Sliding Window Attention, it compresses memory from linear to constant growth, achieving 93.92 percent on the OmniDocBench benchmark. The 3B…
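
The sources above describe memory that stops growing with document length but do not explain how Reference Sliding Window Attention works. The sketch below is therefore only a generic illustration of a bounded cache, not Baidu's implementation. It assumes a small set of pinned "reference" entries plus a fixed window of recent entries. The class `SlidingWindowKVCache`, the helper `cache_sizes`, and the parameters `num_reference` and `window` are hypothetical names introduced for this example.

```python
from collections import deque


class SlidingWindowKVCache:
    """Hypothetical bounded cache: pinned reference entries plus a recent window.

    This is an assumption made for illustration; it is not taken from the
    Unlimited-OCR release.
    """

    def __init__(self, num_reference: int, window: int):
        self.num_reference = num_reference
        self.reference = []                 # earliest entries, never evicted
        self.recent = deque(maxlen=window)  # oldest entry is dropped when full

    def append(self, key, value):
        # Fill the pinned reference slots first, then use the sliding window.
        if len(self.reference) < self.num_reference:
            self.reference.append((key, value))
        else:
            self.recent.append((key, value))

    def __len__(self):
        return len(self.reference) + len(self.recent)

    def entries(self):
        # Entries a decoder step would attend to under this assumed scheme.
        return self.reference + list(self.recent)


def cache_sizes(num_tokens: int, num_reference: int, window: int):
    """Record the cache size after each appended token."""
    cache = SlidingWindowKVCache(num_reference, window)
    sizes = []
    for t in range(num_tokens):
        cache.append(f"k{t}", f"v{t}")
        sizes.append(len(cache))
    return sizes


if __name__ == "__main__":
    print(cache_sizes(10, num_reference=2, window=3))
```

Under these assumptions the cache never holds more than `num_reference + window` entries, however many tokens are appended. A plain KV cache, by contrast, keeps one entry per token and so grows linearly with document length.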