Researchers have developed PereStruct, a new pipeline for parsing complex historical documents, particularly newspapers, which often confound current vision-language models. The system integrates a fine-tuned YOLO architecture for layout analysis with a semantic assembly module that uses TF-IDF, visual embeddings, and geometric constraints to reconstruct articles. PereStruct achieved a state-of-the-art F1 score of 0.904 on block-to-article mapping and significantly outperformed generic vision-language models such as Qwen3.6 in fidelity.
Impact: Establishes a new benchmark for historical document analysis, potentially accelerating archival digitization and research.
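To make the semantic assembly idea in the summary concrete, here is a minimal sketch of grouping layout blocks into articles by combining TF-IDF text similarity with a geometric constraint. It is not PereStruct's actual implementation: the `Block` class, the `assemble` function, and the `max_gap` and `min_sim` thresholds are hypothetical names and values chosen for illustration, and the YOLO layout detector and visual embeddings are left out entirely.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class Block:
    """A detected text block (hypothetical structure, not from the paper)."""
    text: str
    box: tuple  # (x0, y0, x1, y1) in page pixels


def tfidf_vectors(texts):
    """Build sparse TF-IDF vectors (dicts) from whitespace-tokenized texts."""
    docs = [Counter(t.lower().split()) for t in texts]
    n = len(docs)
    df = Counter(term for d in docs for term in d)
    # Smoothed IDF so that very common terms keep a small positive weight.
    return [
        {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in d.items()}
        for d in docs
    ]


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def gap(a, b):
    """Distance between two rectangles (0 if they touch or overlap)."""
    dx = max(a[0] - b[2], b[0] - a[2], 0)
    dy = max(a[1] - b[3], b[1] - a[3], 0)
    return math.hypot(dx, dy)


def assemble(blocks, max_gap=50.0, min_sim=0.2):
    """Group block indices into articles.

    Two blocks are linked when they are close on the page AND textually
    similar; linked blocks are merged with a simple union-find.
    Both thresholds are illustrative assumptions.
    """
    vecs = tfidf_vectors([b.text for b in blocks])
    parent = list(range(len(blocks)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(blocks)):
        for j in range(i + 1, len(blocks)):
            near = gap(blocks[i].box, blocks[j].box) <= max_gap
            similar = cosine(vecs[i], vecs[j]) >= min_sim
            if near and similar:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(blocks)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Two side-by-side columns, each holding one two-block article.
sample = [
    Block("flood river town damage", (0, 0, 100, 50)),
    Block("river flood rescue boats town", (0, 60, 100, 120)),
    Block("election mayor votes council", (110, 0, 210, 50)),
    Block("mayor election council results", (110, 60, 210, 120)),
]
print(assemble(sample))  # [[0, 1], [2, 3]]
```

In this toy page all four blocks are geometrically adjacent, so the text similarity is what keeps the two columns apart. A fuller version would presumably add a visual-embedding term to the linking score, as the summary mentions.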
Ranking rationale: Academic paper detailing a new method for document parsing.
AI-generated summary · Google Gemini · from 1 source.