New benchmarks and models advance document parsing and table extraction
By PulseAugur Editorial · [6 sources]
Researchers have introduced new benchmarks and improved models for document parsing and table extraction. Dr. DocBench targets expert-level document parsing, including complex structures such as chemical formulas and music notation, and highlights the limitations of current models. DTBench is a synthetic benchmark for document-to-table extraction that evaluates LLMs on reasoning and conflict resolution. PaddleOCR-VL-1.6 upgrades PaddleOCR-VL-1.5 with region-aware optimization and progressive post-training, and reports state-of-the-art results on OmniDocBench v1.6.
AI
IMPACT
Advances in document and table extraction benchmarks and models could help AI systems process and analyze complex documents and tabular data more reliably.
RANK_REASON
Multiple research papers introducing new benchmarks and model improvements for document parsing and table extraction.
arXiv:2606.01393v1 · Abstract: Document parsing and recognition are fundamental capabilities for vision-language models (VLMs) and document processing systems. However, existing Optical Character Recognition (OCR) and document parsing benchmarks are increasingl…
arXiv:2603.18652v2 · Abstract: Reliably extracting tables from PDFs is essential for large-scale scientific data mining and knowledge base construction, yet existing evaluation approaches rely on rule-based metrics that fail to capture semantic equivale…
PaddleOCR-VL-1.6 enhances document parsing performance through targeted data optimization and progressive post-training techniques, achieving state-of-the-art results on OmniDocBench v1.6.
arXiv cs.AI
TIER_1 · English (EN) · Yuxiang Guo, Zhuoran Du, Nan Tang, Kezheng Tang, Congcong Ge, Yunjun Gao
arXiv:2602.13812v3 · Abstract: Document-to-table (Doc2Table) extraction derives structured tables from unstructured documents under a target schema, enabling reliable and verifiable SQL-based data analytics. Although large language models (LLMs) have sh…
arXiv:2606.03264v1 · Abstract: We introduce PaddleOCR-VL-1.6, an upgraded compact document parsing model built upon PaddleOCR-VL-1.5. Although PaddleOCR-VL-1.5 establishes a strong 0.9B baseline, its remaining errors concentrate in under-optimized regions where m…
arXiv:2512.10888v3 · Abstract: Table extraction (TE) is a key challenge in document understanding. Traditional approaches detect tables first, then recognize their structure. Recently, interest has surged in developing methods, such as vision-language models …
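To make the Doc2Table idea in the abstracts above concrete, here is a minimal Python sketch of the step after extraction: coercing extracted records to a target schema and querying them with SQL. It is not taken from any of the papers. The schema, the `conform` and `load_table` helpers, and the sample records are all hypothetical, and the records are hand-written stand-ins for what an LLM extractor might return.

```python
import sqlite3

# Hypothetical target schema (an assumption for illustration): column -> Python type.
TARGET_SCHEMA = {"company": str, "year": int, "revenue_musd": float}
SQL_TYPES = {str: "TEXT", int: "INTEGER", float: "REAL"}


def conform(record, schema):
    """Coerce one extracted record to the schema; a missing field raises KeyError."""
    return tuple(schema[col](record[col]) for col in schema)


def load_table(records, schema, table="extracted"):
    """Load schema-conformant rows into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    cols = ", ".join(f"{name} {SQL_TYPES[typ]}" for name, typ in schema.items())
    conn.execute(f"CREATE TABLE {table} ({cols})")
    placeholders = ", ".join("?" for _ in schema)
    conn.executemany(
        f"INSERT INTO {table} VALUES ({placeholders})",
        [conform(r, schema) for r in records],
    )
    return conn


# Hand-written stand-ins for extractor output (not real data).
records = [
    {"company": "Acme", "year": "2023", "revenue_musd": "12.5"},
    {"company": "Acme", "year": "2024", "revenue_musd": "15.0"},
    {"company": "Globex", "year": "2024", "revenue_musd": "7.25"},
]

conn = load_table(records, TARGET_SCHEMA)
row_count = conn.execute("SELECT COUNT(*) FROM extracted").fetchone()[0]
total_2024 = conn.execute(
    "SELECT SUM(revenue_musd) FROM extracted WHERE year = 2024"
).fetchone()[0]
print(row_count, total_2024)
```

Once rows conform to a fixed schema, any downstream figure can be traced back to a SQL query over them, which is one way to read the "verifiable" claim in the abstract. A real pipeline would also need to handle conflicting or missing values, which this sketch simply rejects.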