PereStruct pipeline robustly parses complex historical documents

By PulseAugur Editorial Team · [1 source] · 2026-06-09 04:00

Researchers have developed PereStruct, a pipeline for parsing complex historical documents, particularly newspapers, whose layouts often confound current vision-language models. The system combines a fine-tuned YOLO architecture for layout analysis with a semantic assembly module that uses TF-IDF, visual embeddings, and geometric constraints to reconstruct articles. PereStruct achieved a state-of-the-art F1 score of 0.904 on block-to-article mapping and significantly outperformed generic vision-language models such as Qwen3.6 in fidelity.
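To make the idea of semantic assembly concrete, here is a minimal sketch of how text, visual, and geometric signals could be combined to group layout blocks into articles. It is not the authors' implementation: the weights, the `max_gap` and `threshold` values, the block dictionary format, and the union-find grouping are all assumptions chosen for illustration, and the visual embeddings are stand-in two-dimensional vectors.

```python
import math
from collections import Counter

def tfidf_vectors(texts):
    """Sparse TF-IDF vectors (term -> weight) using a smoothed IDF."""
    docs = [Counter(t.lower().split()) for t in texts]
    n = len(docs)
    df = Counter(term for d in docs for term in d)
    return [{t: c * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, c in d.items()}
            for d in docs]

def sparse_cosine(a, b):
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def dense_cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu and nv else 0.0

def box_gap(a, b):
    """Distance between two boxes (x0, y0, x1, y1); 0 if they touch or overlap."""
    dx = max(a[0] - b[2], b[0] - a[2], 0.0)
    dy = max(a[1] - b[3], b[1] - a[3], 0.0)
    return math.hypot(dx, dy)

def assemble(blocks, w_text=0.5, w_vis=0.3, w_geo=0.2, max_gap=50.0, threshold=0.5):
    """Return one article label per block (all parameters are hypothetical)."""
    n = len(blocks)
    text_vecs = tfidf_vectors([b["text"] for b in blocks])
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            gap = box_gap(blocks[i]["box"], blocks[j]["box"])
            if gap > max_gap:  # hard geometric constraint: too far apart to link
                continue
            score = (w_text * sparse_cosine(text_vecs[i], text_vecs[j])
                     + w_vis * dense_cosine(blocks[i]["emb"], blocks[j]["emb"])
                     + w_geo * (1.0 - gap / max_gap))
            if score >= threshold:
                parent[find(i)] = find(j)  # merge the two groups

    # Relabel group roots as 0, 1, 2, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]

# Toy page: two stacked blocks in a left column, two in a right column.
demo_blocks = [
    {"text": "wheat harvest prices rise in the market", "box": (0, 0, 100, 50), "emb": [1.0, 0.0]},
    {"text": "market prices for wheat rise again", "box": (0, 55, 100, 120), "emb": [0.9, 0.1]},
    {"text": "steamer arrived in the harbor with cargo", "box": (300, 0, 400, 50), "emb": [0.0, 1.0]},
    {"text": "cargo from the steamer unloaded at harbor", "box": (300, 55, 400, 120), "emb": [0.1, 0.9]},
]
labels = assemble(demo_blocks)
```

In this toy page the two columns sit 200 units apart, so the geometric constraint alone prevents blocks from different columns from being linked; within a column, the weighted score decides whether stacked blocks belong to the same article.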

Impact: Establishes a new benchmark for historical document analysis, potentially accelerating archival digitization and research.

Ranking rationale: Academic paper detailing a new method for document parsing.

Read on arXiv cs.CV →

AI-generated summary · Google Gemini · based on 1 source.

Sources [1]

arXiv cs.CV TIER_1 English(EN) · Maksim Shandybo, Ivan Bespalov, Daniil Yefimov, Marina Kosheleva, Alexander Loukianov · 2026-06-09 04:00

PereStruct: Multimodal Semantic Assembly for Robust Historical Document Parsing

arXiv:2606.07661v1 Announce Type: new Abstract: Parsing historical documents with complex, non-standard layouts remains a fundamental bottleneck in large-scale archival digitization. Unlike modern typography, historical newspapers exhibit severe physical degradation and highly ir…

