PulseAugur

Baidu releases Unlimited OCR for advanced long-document parsing

Baidu has released Unlimited OCR, a model for document parsing. It uses a constant-size KV cache and is reported to reach state-of-the-art performance, particularly on long documents. The model is available on Hugging Face, runs with the Transformers library and with inference engines such as vLLM and SGLang, and can also be deployed with Docker.

IMPACT Better long-document parsing could benefit industries that process large volumes of text-heavy documents.

RANK_REASON Model release from a significant AI lab (Baidu) with a specific name and capability.

AI-generated summary · Google Gemini · from 5 sources.

COVERAGE [5]

  1. Hugging Face Trending Models TIER_1 (ET) · baidu ·

    baidu/Unlimited-OCR

    image-text-to-text · 47 downloads · 55 likes

  2. Pandaily TIER_1 English(EN) · [email protected] (Pandaily) ·

    Baidu Unveils Unlimited-OCR: Constant KV Cache Delivers SOTA Performance on Long Documents

  3. Mastodon — fosstodon.org TIER_1 中文(ZH) · [email protected] ·

    🌘 GitHub - baidu/Unlimited-OCR: The Era of Unlimited OCR: Embracing the Revolution of One-Shot Long-Horizon Parsing ➤ Building a high-performance, industrial-grade OCR parsing solution for long text ✤ https://github.com/baidu/Unlimited-OCR Baidu has open-sourced the "Unlimited-OCR" project, which aims to push the boundaries of document parsing technology. The tool focuses on "One-shot Long-horizon Parsing" and can efficiently handle OCR for both single-page and multi-page documents. The model supports standard inference with Hugging Face Transformers and, for high-performance needs, also provides…

  4. Mastodon — fosstodon.org TIER_1 English(EN) · [email protected] ·

    🚀 #GitHub and #Baidu introduce "Unlimited OCR: One-Shot Long-Horizon Parsing," proving that even #AI can get lost in its own overcomplicated jargon maze. 🙄 With promises of "direct agents" and "automate any workflow," it's like they've discovered the fax machine of the digital…

  5. Mastodon — fosstodon.org TIER_1 English(EN) · [email protected] ·

    Baidu has unveiled Unlimited-OCR, a new model that solves a fundamental bottleneck in long-document transcription. By introducing Reference Sliding Window Attention, it compresses memory from linear to constant growth, achieving 93.92 percent on the OmniDocBench benchmark. The 3B…
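
    The linear-versus-constant memory claim in the post above can be illustrated with a generic sliding-window cache. This is only a minimal sketch of plain sliding-window caching, not Baidu's Reference Sliding Window Attention, whose details the post does not give. The names `SlidingWindowKVCache` and `cache_sizes` and the string placeholders standing in for key/value tensors are hypothetical.

```python
from collections import deque


class SlidingWindowKVCache:
    """Hypothetical cache that keeps only the most recent `window` (key, value) pairs."""

    def __init__(self, window: int) -> None:
        if window <= 0:
            raise ValueError("window must be positive")
        self.window = window
        # A deque with maxlen drops its oldest entry automatically on append,
        # so the cache never holds more than `window` entries.
        self._entries = deque(maxlen=window)

    def append(self, key, value) -> None:
        self._entries.append((key, value))

    def __len__(self) -> int:
        return len(self._entries)


def cache_sizes(num_tokens: int, window: int):
    """Return (full_cache_size, windowed_cache_size) after processing num_tokens tokens."""
    full = []  # a conventional cache: one entry per token, grows linearly
    windowed = SlidingWindowKVCache(window)  # bounded by `window`
    for t in range(num_tokens):
        key, value = f"k{t}", f"v{t}"  # placeholders for real key/value tensors
        full.append((key, value))
        windowed.append(key, value)
    return len(full), len(windowed)


if __name__ == "__main__":
    for n in (10, 1_000, 100_000):
        print(n, cache_sizes(n, window=128))
```

    In this sketch the full cache grows with the document length while the windowed cache stops growing once it reaches the window size, which is the trade-off the post describes as moving from linear to constant memory.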