Researchers have developed the winning system for the KSAA-2026 Shared Task on Arabic Speech Dictation with Automatic Diacritization. The system, named Thaka, fine-tunes a CATT-Whisper multimodal model on a limited dataset of 2,327 samples. Key to its success were training regularization techniques, including R-Drop consistency regularization, Focal Loss, and optimized hyperparameters, combined at inference with averaging over 200 stochastic forward passes drawn from four model checkpoints. This approach achieved a Word Error Rate (WER) of 23.26%, securing first place among participants.
AI IMPACT Demonstrates advanced fine-tuning techniques for low-resource speech diacritization tasks.
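For readers unfamiliar with the techniques named in the summary, the following is a minimal, dependency-free sketch of how R-Drop, Focal Loss, and averaged stochastic inference are commonly formulated. It is not the Thaka implementation. Every function name, the toy linear "checkpoint", the values of `alpha`, `gamma`, and `drop_p`, the split of 50 passes per checkpoint (200 in total), and the choice to average probabilities rather than logits are assumptions made for illustration only.

```python
import math
import random

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def focal_loss(probs, target, gamma=2.0):
    """Focal loss for one example: -(1 - p_t)^gamma * log(p_t)."""
    p_t = max(probs[target], 1e-12)
    return -((1.0 - p_t) ** gamma) * math.log(p_t)

def kl_div(p, q):
    """KL(p || q) for two discrete distributions."""
    return sum(pi * math.log(pi / max(qi, 1e-12)) for pi, qi in zip(p, q) if pi > 0)

def r_drop_loss(logits_a, logits_b, target, alpha=1.0, gamma=2.0):
    """R-Drop: two dropout passes on the same input, task loss plus symmetric KL.

    Using focal loss as the task loss is an assumption of this sketch.
    """
    p, q = softmax(logits_a), softmax(logits_b)
    task = 0.5 * (focal_loss(p, target, gamma) + focal_loss(q, target, gamma))
    consistency = 0.5 * (kl_div(p, q) + kl_div(q, p))
    return task + alpha * consistency

def make_toy_checkpoint(weights, drop_p=0.1):
    """Hypothetical stand-in for a checkpoint: a linear layer with input dropout."""
    def forward(features, rng):
        kept = [0.0 if rng.random() < drop_p else x / (1.0 - drop_p) for x in features]
        return [sum(w * x for w, x in zip(row, kept)) for row in weights]
    return forward

def average_passes(checkpoints, features, passes_per_checkpoint, rng):
    """Average class probabilities over stochastic passes from several checkpoints."""
    total, count = None, 0
    for forward in checkpoints:
        for _ in range(passes_per_checkpoint):
            probs = softmax(forward(features, rng))
            total = probs if total is None else [t + p for t, p in zip(total, probs)]
            count += 1
    return [t / count for t in total]

# Usage: four toy checkpoints, 50 dropout-active passes each (200 in total).
rng = random.Random(0)
checkpoints = [
    make_toy_checkpoint([[0.5 + 0.1 * k, -0.2, 0.3], [-0.4, 0.6, 0.1 * k]])
    for k in range(4)
]
averaged = average_passes(checkpoints, [1.0, -0.5, 2.0], 50, rng)
prediction = max(range(len(averaged)), key=averaged.__getitem__)
```

In a real diacritization model the same averaging would be applied per output position, with dropout left active at inference time so that each pass is stochastic.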
RANK_REASON The cluster contains a research paper detailing a winning system for a specific task in automatic speech recognition and diacritization.
AI-generated summary · Google Gemini · from 2 sources.