Researchers have developed Unlimited OCR, a model that addresses the memory and speed limits of current OCR systems on long documents. It replaces standard attention layers with Reference Sliding Window Attention (R-SWA), which keeps the KV cache at a constant size, so the model can transcribe dozens of pages in a single forward pass. The approach builds on the DeepSeek OCR baseline and also applies to other sequence-based tasks such as ASR and translation.
AI IMPACT Enables more efficient processing of long documents for OCR and other sequence-based AI tasks.
RANK_REASON Publication of a technical report detailing a new model architecture and its application.
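To make the constant-KV-cache idea concrete, here is a minimal sketch in plain Python. It is not the Unlimited OCR implementation: the summary does not say what "Reference" means in R-SWA, so this toy assumes a fixed set of reference entries that is never evicted, plus a sliding window of the most recent entries. The names `BoundedKVCache`, `simulate`, `num_reference`, and `window` are all hypothetical.

```python
from collections import deque


class BoundedKVCache:
    """Toy cache: a fixed reference set plus a sliding window of recent entries."""

    def __init__(self, reference, window):
        # Assumption: reference entries are always kept and never evicted.
        self.reference = list(reference)
        # A deque with maxlen drops its oldest entry automatically on append.
        self.recent = deque(maxlen=window)

    def append(self, entry):
        self.recent.append(entry)

    def visible(self):
        # Entries a newly generated token would attend to at this step.
        return self.reference + list(self.recent)

    def __len__(self):
        return len(self.reference) + len(self.recent)


def simulate(num_steps, num_reference=4, window=8):
    """Return the cache size after each decoding step."""
    refs = [f"ref{i}" for i in range(num_reference)]
    cache = BoundedKVCache(reference=refs, window=window)
    sizes = []
    for step in range(num_steps):
        cache.append(f"tok{step}")
        sizes.append(len(cache))
    return sizes


if __name__ == "__main__":
    sizes = simulate(num_steps=1000)
    # The size never exceeds num_reference + window, however long the output.
    print(max(sizes))
```

With full attention the cache would grow by one entry per generated token; in this sketch it stops growing once the window fills, which is the property the summary attributes to R-SWA.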
AI-generated summary · Google Gemini · from 2 sources.