PulseAugur

New benchmarks and models advance document parsing and table extraction

Researchers have introduced new benchmarks and improved models for document parsing and table extraction. Dr. DocBench focuses on expert-level document parsing, including complex structures like chemical formulas and music notation, highlighting current model limitations. DTBench offers a synthetic benchmark for document-to-table extraction, evaluating LLMs on reasoning and conflict resolution. PaddleOCR-VL-1.6, a compact model upgraded from PaddleOCR-VL-1.5, adds under-optimized region refinement and progressive post-training and reports state-of-the-art results on OmniDocBench v1.6.

IMPACT Harder benchmarks and stronger models for document and table extraction should improve how reliably AI systems process complex documents and structured data.

RANK_REASON Multiple research papers introducing new benchmarks and model improvements for document parsing and table extraction.


AI-generated summary · Google Gemini · from 6 sources.

COVERAGE [6]

  1. arXiv cs.AI TIER_1 English(EN) · Minglai Yang, Xinyan Velocity Yu, Pengyuan Li, Xinyu Guo, Zhenting Qi, Konwoo Kim, Longtian Ye, Xiaolong Luo, Jinhe Bi, Henry Zhang, Haris Riaz, Xuan Zhang, Yunze Xiao, Bangya Liu, Tom Tang, Yunfei Zhao, Qunshu Lin, Zihan Wang, Minghao Liu, Michael Lingz…

    Dr. DocBench: A Comprehensive Benchmark for Expert-Level and Difficult Document Parsing

    arXiv:2606.01393v1 · Announce Type: cross · Abstract: Document parsing and recognition are fundamental capabilities for vision-language models (VLMs) and document processing systems. However, existing Optical Character Recognition (OCR) and document parsing benchmarks are increasingl…

  2. arXiv cs.AI TIER_1 English(EN) · Pius Horn, Janis Keuper

    Beyond String Matching: Semantic Evaluation of PDF Table Extraction

    arXiv:2603.18652v2 · Announce Type: replace-cross · Abstract: Reliably extracting tables from PDFs is essential for large-scale scientific data mining and knowledge base construction, yet existing evaluation approaches rely on rule-based metrics that fail to capture semantic equivale…

  3. Hugging Face Daily Papers TIER_1 English(EN)

    PaddleOCR-VL-1.6: Expanding the Frontier of Document Parsing with Under-Optimized Region Refinement and Progressive Post-Training

    PaddleOCR-VL-1.6 enhances document parsing performance through targeted data optimization and progressive post-training techniques, achieving state-of-the-art results on OmniDocBench v1.6.

  4. arXiv cs.AI TIER_1 English(EN) · Yuxiang Guo, Zhuoran Du, Nan Tang, Kezheng Tang, Congcong Ge, Yunjun Gao

    DTBench: A Synthetic Benchmark for Document-to-Table Extraction

    arXiv:2602.13812v3 · Announce Type: replace-cross · Abstract: Document-to-table (Doc2Table) extraction derives structured tables from unstructured documents under a target schema, enabling reliable and verifiable SQL-based data analytics. Although large language models (LLMs) have sh…

  5. arXiv cs.CV TIER_1 English(EN) · Zelun Zhang, Hongen Liu, Suyin Liang, Yubo Zhang, Yiqing Xiang, Jiaxuan Liu, Ting Sun, Manhui Lin, Yue Zhang, Changda Zhou, Tingquan Gao, Cheng Cui, Yi Liu, Dianhai Yu, Yanjun Ma

    PaddleOCR-VL-1.6: Expanding the Frontier of Document Parsing with Under-Optimized Region Refinement and Progressive Post-Training

    arXiv:2606.03264v1 · Announce Type: new · Abstract: We introduce PaddleOCR-VL-1.6, an upgraded compact document parsing model built upon PaddleOCR-VL-1.5. Although PaddleOCR-VL-1.5 establishes a strong 0.9B baseline, its remaining errors concentrate in under-optimized regions where m…

  6. arXiv cs.CV TIER_1 English(EN) · Brandon Smock, Valerie Faucon-Morin, Max Sokolov, Libin Liang, Tayyibah Khanam, Amrit Ramesh, Maury Courtland

    PubTables-v2: A new large-scale dataset for full-page and multi-page table extraction

    arXiv:2512.10888v3 · Announce Type: replace · Abstract: Table extraction (TE) is a key challenge in document understanding. Traditional approaches detect tables first, then recognize their structure. Recently, interest has surged in developing methods, such as vision-language models …