PulseAugur

Arabic ASR model training stalls, user seeks community help

A Reddit user is seeking help with an Arabic Automatic Speech Recognition (ASR) model that fails to converge during training. The model, based on a SpeechBrain Conformer-Transformer architecture, is trained with a combination of CTC and KL-divergence losses. Both losses drop significantly early in training but then quickly plateau, leaving a high Word Error Rate (WER) on the validation sets. The user has tried adjusting the learning rate, batch size, and vocabulary size without success and is asking the community for potential causes or solutions.
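For readers unfamiliar with the metric mentioned above, WER is the word-level edit distance between a reference transcript and the model's hypothesis, divided by the number of reference words. The following is a minimal sketch in plain Python, not the scoring code used by the poster or by SpeechBrain; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative helper (hypothetical name); tokens are split on whitespace.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current row) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against a four-word reference.
    print(word_error_rate("a b c d", "a x c"))
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference, which is one reason a plateaued model can report a very high value.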

IMPACT This discussion highlights common challenges in training specialized ASR models, potentially offering insights for other researchers working with similar architectures or languages.

RANK_REASON User is asking for help with a specific machine learning model training issue, which falls under research-level discussion.

Read on r/MachineLearning →

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. r/MachineLearning TIER_1 English(EN) · /u/Sweet-Hamster-4991

    Arabic ASR model struggling to converge during training [D]

    I'm trying to train an ASR model using the LibriSpeech recipe from SpeechBrain (https://github.com/speechbrain/speechbrain/blob/develop/recipes/LibriSpeech/ASR/transformer/train.py) (without the language model) on a 100-hour dataset o…