PulseAugur

New dataset and CRNN model advance Urdu handwritten text recognition

Researchers have introduced the Urdu Katib Handwritten Dataset (UKHD), the first offline dataset of historical Urdu handwritten text lines. This dataset aims to address the scarcity of resources for Urdu Handwritten Text Recognition (UHTR). The study also evaluated various CRNN-based models, identifying CNN-BGRU-CTC as the most effective for Urdu Katib Handwriting Recognition, achieving low character and word error rates.
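For readers unfamiliar with the two metrics named above, here is a minimal sketch of how character error rate (CER) and word error rate (WER) are commonly computed: the Levenshtein edit distance between the recognized text and the reference transcription, divided by the reference length. This is the standard textbook definition, not necessarily the paper's exact evaluation protocol; the function names `edit_distance`, `cer`, and `wer` are illustrative, and any text normalization the authors applied is unknown here.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences.

    Insertions, deletions, and substitutions each cost 1.
    Works on strings (character level) or lists of words.
    """
    prev = list(range(len(hyp) + 1))  # distance from empty ref prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to empty hyp prefix
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r
                            curr[j - 1] + 1,      # insert h
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: character edits / reference length.

    Assumption: characters are Python code points. For Urdu text,
    combining diacritics count as separate characters here.
    """
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word error rate: word edits / reference word count.

    Assumption: words are separated by whitespace.
    """
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


if __name__ == "__main__":
    print(cer("kitten", "sitting"))  # 3 edits over 6 characters: 0.5
```

Both rates can exceed 1.0 when the hypothesis is much longer than the reference, which is why they are error rates and not accuracies.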

IMPACT This dataset and model evaluation could spur further development in recognizing historical Urdu script, aiding in the preservation of cultural heritage.

AI-generated summary · Google Gemini · from 2 sources.

COVERAGE [2]

  1. arXiv cs.CL TIER_1 English(EN) · Ramza Basharat, Muhammad Usman Ali

    Urdu Katib Handwritten Dataset: A Historical Document Dataset for Offline Urdu Handwritten Text Recognition with CRNN-Based Baseline Evaluation

    arXiv:2606.19139v1. Abstract: Automatic Handwritten Text Recognition (HTR) is inherently a challenging task, and its complexity is further increased when dealing with cursive scripts. Although significant efforts have been made on various cursive scripts, rese…

  2. arXiv cs.CV TIER_1 English(EN) · Muhammad Usman Ali

    Urdu Katib Handwritten Dataset: A Historical Document Dataset for Offline Urdu Handwritten Text Recognition with CRNN-Based Baseline Evaluation

    Automatic Handwritten Text Recognition (HTR) is inherently a challenging task, and its complexity is further increased when dealing with cursive scripts. Although significant efforts have been made on various cursive scripts, research regarding Urdu Handwritten Text Recognition (…