PulseAugur

Unlimited OCR model uses new attention to process long documents efficiently

Researchers have developed Unlimited OCR, a model that addresses the memory and speed limits that current OCR systems hit on long documents. It replaces standard attention layers with Reference Sliding Window Attention (R-SWA), which keeps the KV cache at a constant size, so the model can transcribe dozens of pages in a single forward pass. The approach builds on the DeepSeek OCR baseline and is described as applicable to other sequence-based tasks such as ASR and translation.
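To make the constant-size KV cache idea concrete, here is a minimal Python sketch. It is not the paper's implementation: the summary does not describe how R-SWA is built, so the sketch assumes a fixed set of always-kept "reference" entries plus a sliding window of the most recent entries. The names `BoundedKVCache` and `peak_cache_size` are hypothetical and used only for illustration.

```python
from collections import deque


class BoundedKVCache:
    """Toy KV cache with a hard size limit.

    Assumption (not confirmed by the source): some "reference" entries are
    always kept, and only the most recent `window` decoded entries are
    retained alongside them.
    """

    def __init__(self, reference, window):
        if window <= 0:
            raise ValueError("window must be positive")
        self.reference = list(reference)      # never evicted
        self.recent = deque(maxlen=window)    # oldest entry drops off automatically

    def append(self, kv):
        """Store the key/value entry for a newly decoded token."""
        self.recent.append(kv)

    def visible(self):
        """Entries the next token may attend to."""
        return self.reference + list(self.recent)

    def __len__(self):
        return len(self.reference) + len(self.recent)


def peak_cache_size(num_steps, num_reference, window):
    """Decode `num_steps` tokens and report the largest cache size observed."""
    cache = BoundedKVCache(range(num_reference), window)
    peak = 0
    for step in range(num_steps):
        cache.append(("tok", step))
        peak = max(peak, len(cache))
    return peak
```

Under these assumptions the cache never exceeds `num_reference + window` entries regardless of how many tokens are decoded, whereas a standard attention cache grows by one entry per decoded token.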

IMPACT Enables more efficient processing of long documents for OCR and other sequence-based AI tasks.

RANK_REASON Publication of a technical report detailing a new model architecture and its application.

AI-generated summary · Google Gemini · from 2 sources.

COVERAGE [2]

  1. arXiv cs.CL TIER_1 English(EN) · Lei Jia

    Unlimited OCR Works

    Recently, end-to-end OCR models, exemplified by DeepSeek OCR, have once again thrust OCR into the spotlight. A widely held view is that employing a large language model (LLM) as the decoder allows the model to leverage the prior distribution of language, leading to improved OCR p…

  2. Hugging Face Daily Papers TIER_1 English(EN)

    Unlimited OCR Works

    Unlimited OCR introduces Reference Sliding Window Attention to eliminate growing memory consumption during long-sequence OCR tasks, enabling efficient transcription of multiple pages in a single forward pass.