OmniDocBench
PulseAugur coverage of OmniDocBench — every cluster mentioning OmniDocBench across labs, papers, and developer communities, ranked by signal.
3 day(s) with sentiment data
-
Baidu releases Unlimited OCR, challenging long-context AI memory mechanisms · 1 source tracked
Baidu has open-sourced a new OCR model called Unlimited OCR, which excels at processing long documents by mimicking human reading habits. Unlike traditional OCR systems that process documents page by page and then stitc…
-
Open-source OCR models and benchmarks consolidated on Papers with Code
A new resource has been created to track open-source optical character recognition (OCR) models, consolidating information on top-performing models, benchmarks, and links to their papers and code. This initiative highli…
-
Baidu releases Unlimited OCR with constant KV cache for long documents
Baidu has released Unlimited OCR, a 3-billion-parameter Mixture-of-Experts model designed for efficient long-document parsing. The model utilizes Reference Sliding Window Attention (R-SWA) to maintain a constant KV cach…
-
ABot-OCR model transcribes pages directly to Markdown
Researchers have introduced ABot-OCR, a novel end-to-end vision-language model designed for direct transcription of page images into Markdown. This approach bypasses the need for complex modular systems by processing th…
-
New method enhances VLM document layout understanding
Researchers have developed a new method to improve how Vision-Language Models (VLMs) understand document layouts, particularly for documents with structures not seen during training. The approach pre-resolves layout inf…
-
New PureDocBench benchmark reveals document parsing is far from solved
Researchers have introduced PureDocBench, a new benchmark for document parsing that addresses issues with the existing OmniDocBench dataset, which suffers from annotation errors and potential contamination. PureDocBench…
-
RTPrune boosts DeepSeek-OCR inference speed by 1.23x with novel token pruning
Researchers have developed RTPrune, a novel two-stage token pruning method designed to enhance the efficiency of DeepSeek-OCR inference. This method mimics the model's two-stage reading process, first prioritizing high-…