Text region detection in historical astronomical diagrams
Researchers have introduced a new dataset for text detection in historical astronomical diagrams, addressing a gap in document analysis. The dataset, comprising 948 diagrams from the 8th to 18th centuries, features over 10,000 annotated text regions with precise polygonal delineations and reading direction encoding. Several baseline models were evaluated, with Poly-DETR, an extension of DINO-DETR, showing strong performance on existing benchmarks and serving as a solid baseline for this new dataset. The dataset and code are publicly available.
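
To make the annotation idea concrete, here is a minimal Python sketch of how a text region with a polygonal outline and a reading direction might be represented. It is illustrative only: the summary does not describe the dataset's actual file format, so the class name `TextRegion`, the field names, and the choice to store the reading direction as an angle in degrees are all assumptions.

```python
from dataclasses import dataclass
import math


@dataclass
class TextRegion:
    """Hypothetical annotation record; not the dataset's actual schema."""

    # Polygon vertices in image coordinates (pixels), listed in order.
    polygon: list[tuple[float, float]]
    # Assumed encoding: angle of the text baseline, in degrees.
    reading_angle_deg: float

    def area(self) -> float:
        """Return the polygon area using the shoelace formula."""
        pts = self.polygon
        total = 0.0
        # Pair each vertex with the next one, wrapping around to the first.
        for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]):
            total += x1 * y2 - x2 * y1
        return abs(total) / 2.0

    def direction_vector(self) -> tuple[float, float]:
        """Return a unit vector pointing along the reading direction."""
        theta = math.radians(self.reading_angle_deg)
        return (math.cos(theta), math.sin(theta))


# A 40 x 10 pixel horizontal label read left to right.
region = TextRegion(
    polygon=[(0, 0), (40, 0), (40, 10), (0, 10)],
    reading_angle_deg=0.0,
)
print(region.area())
print(region.direction_vector())
```

Storing the outline as an ordered vertex list rather than a bounding box is what allows curved or rotated labels, which are common in diagrams, to be delineated tightly. The separate direction field is one simple way to record orientation, and other encodings are equally plausible.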