PulseAugur / Brief



Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

  1. A Text Recognition Dataset from Sahidic Coptic Ancient Manuscripts

    Researchers have introduced SCAM, a new dataset for Handwritten Text Recognition (HTR) of ancient Sahidic Coptic manuscripts. The dataset targets the combined challenges of a low-resource language, a rare script, and degraded historical documents: it mixes heterogeneous acquisition conditions with typical manuscript degradations such as ink fading and material deterioration. Benchmarks of current state-of-the-art HTR approaches on SCAM expose their limitations in low-resource, historically grounded scenarios and give future work a reference point.

    IMPACT This dataset could advance research in low-resource HTR and improve AI's ability to process historical and underrepresented languages.