PulseAugur / Brief



Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

  1. PereStruct: Multimodal Semantic Assembly for Robust Historical Document Parsing

    Researchers have developed PereStruct, a pipeline for parsing complex historical documents, particularly newspapers, which often confound current vision-language models. The system pairs a fine-tuned YOLO architecture for layout analysis with a semantic assembly module that combines TF-IDF, visual embeddings, and geometric constraints to reconstruct articles. PereStruct achieved a state-of-the-art F1 score of 0.904 on block-to-article mapping and significantly outperformed generic vision-language models such as Qwen3.6 in fidelity.

    IMPACT: Establishes a new benchmark for historical document analysis, potentially accelerating archival digitization and research.
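
The brief does not say how PereStruct's semantic assembly module actually combines its signals, so the following is only a rough sketch of the general idea of block-to-article grouping, not the authors' method. It assumes each layout block already has OCR text and a bounding box, scores every pair of blocks with a weighted mix of TF-IDF cosine similarity and a distance-based geometric affinity, and merges pairs above a threshold. Visual embeddings are left out for brevity. The function names, the weights (`w_text`, `w_geom`), the `scale`, the `threshold`, and the sample blocks are all hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(texts):
    """Return one sparse TF-IDF dict per text, using a smoothed IDF."""
    docs = [Counter(t.lower().split()) for t in texts]
    n = len(docs)
    df = Counter(term for d in docs for term in d)  # document frequency
    return [
        {t: c * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, c in d.items()}
        for d in docs
    ]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def geometric_affinity(box_a, box_b, scale=500.0):
    """Boxes are (x0, y0, x1, y1); affinity decays with centre distance."""
    ax, ay = (box_a[0] + box_a[2]) / 2, (box_a[1] + box_a[3]) / 2
    bx, by = (box_b[0] + box_b[2]) / 2, (box_b[1] + box_b[3]) / 2
    return math.exp(-math.hypot(ax - bx, ay - by) / scale)


def assemble(blocks, w_text=0.6, w_geom=0.4, threshold=0.5):
    """Group block indices into articles with union-find over strong pairs."""
    vecs = tfidf_vectors([b["text"] for b in blocks])
    parent = list(range(len(blocks)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(blocks)):
        for j in range(i + 1, len(blocks)):
            score = (w_text * cosine(vecs[i], vecs[j])
                     + w_geom * geometric_affinity(blocks[i]["box"], blocks[j]["box"]))
            if score >= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(blocks)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical page: two stacked blocks on the left, two on the right.
blocks = [
    {"text": "harbor strike dock workers wages", "box": (0, 0, 100, 100)},
    {"text": "dock workers strike continues harbor", "box": (0, 110, 100, 210)},
    {"text": "opera premiere tenor applause theatre", "box": (900, 0, 1000, 100)},
    {"text": "tenor opera theatre season premiere", "box": (900, 110, 1000, 210)},
]
articles = assemble(blocks)
print(articles)
```

On this toy page the two left-hand blocks share vocabulary and sit close together, so they are merged into one article, and likewise the two right-hand blocks. A real pipeline would need reading-order handling and tuned weights, which this sketch does not attempt.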