Researchers have introduced the Urdu Katib Handwritten Dataset (UKHD), the first dataset designed specifically for offline Urdu handwritten text recognition in historical documents. The dataset captures variations in Nastalique calligraphy written by historical scribes. To establish a baseline, the study evaluated several Convolutional Recurrent Neural Network (CRNN) models and found that the CNN-BGRU-CTC architecture performed best, achieving the lowest Character Error Rate (CER) and Word Error Rate (WER). The goal is to foster further research in recognizing and preserving Urdu handwritten literature.
AI IMPACT This new dataset and baseline evaluation could accelerate research into recognizing and preserving historical Urdu documents.
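For readers unfamiliar with the two metrics named in the summary, CER and WER are conventionally defined as the edit (Levenshtein) distance between the recognized text and the reference transcription, divided by the reference length in characters or words. The summary does not say how the study computed them, so the following is only a minimal sketch of that standard definition; the function names `edit_distance`, `cer`, and `wer` are illustrative, not taken from the paper.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character Error Rate: character-level edits divided by reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word Error Rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)
```

Note that Python compares strings by Unicode code point, so for Urdu text with combining marks this sketch counts code points rather than visually distinct characters; whether the study normalizes text before scoring is not stated in the summary.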
RANK_REASON The item describes a new dataset and baseline evaluation for a specific research task (Urdu handwritten text recognition), published on arXiv.
- arXiv
- CNN-BGRU-CTC
- computer science
- Computer vision and pattern recognition
- Convolutional Recurrent Neural Network
- Nastalique
- Urdu Handwritten Text Recognition
- Urdu Katib Handwritten Dataset
AI-generated summary · Google Gemini · from 1 source.