MinerU-Popo: Universal Post-Processing Model for Structured Document Parsing
Researchers have developed MinerU-Popo, a framework that improves structured document parsing by addressing a limitation of current OCR models built on vision-language models (VLMs). The system reconstructs document-level logical structures, such as paragraphs and tables, that are often fragmented across page boundaries. It relies on a lightweight post-processing model fine-tuned on a custom dataset and uses dynamic chunking to handle long documents. With this approach, MinerU-Popo significantly improves accuracy in RAG applications and reduces latency.
AI IMPACT: Enhances document understanding for AI systems, potentially improving RAG accuracy and efficiency.
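The summary does not say how the dynamic chunking works, so the following is only a minimal sketch of one generic way to split a long document into overlapping windows. It is not MinerU-Popo's actual method. The function name `dynamic_chunks`, the `budget` and `overlap` parameters, and the use of per-block lengths as a stand-in for token counts are all assumptions made for illustration. The idea is that consecutive parsed blocks are grouped until a size budget is reached, and adjacent chunks share a trailing block so that an element split across a page boundary appears together in at least one chunk.

```python
from typing import List


def dynamic_chunks(block_lengths: List[int], budget: int, overlap: int = 1) -> List[List[int]]:
    """Group consecutive block indices into chunks that fit a size budget.

    Hypothetical helper for illustration only. `block_lengths[i]` is the
    size (for example, a token count) of the i-th parsed block. Adjacent
    chunks share up to `overlap` blocks so that content spanning a chunk
    boundary is seen in full by at least one chunk.
    """
    if budget <= 0:
        raise ValueError("budget must be positive")
    if overlap < 0:
        raise ValueError("overlap must be non-negative")

    chunks: List[List[int]] = []
    n = len(block_lengths)
    start = 0
    while start < n:
        end = start
        total = 0
        # Always take at least one block, even if it alone exceeds the budget.
        while end < n and (end == start or total + block_lengths[end] <= budget):
            total += block_lengths[end]
            end += 1
        chunks.append(list(range(start, end)))
        if end >= n:
            break
        # Step back by `overlap` blocks, but always move forward by at least one.
        start = max(end - overlap, start + 1)
    return chunks


if __name__ == "__main__":
    # Five blocks with made-up sizes and a budget of 100 units per chunk.
    print(dynamic_chunks([40, 30, 50, 20, 60], budget=100))
```

Each chunk could then be passed to a post-processing model independently, with results from the shared blocks reconciled afterwards. How that reconciliation is done is not described in the summary.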