PulseAugur

New technique boosts Vietnamese speech translation accuracy

Researchers have developed Phonetically-Informed Data Augmentation (PiDA), a data augmentation technique for improving Vietnamese speech translation. The method targets error propagation in cascaded speech translation systems, where mistakes in the Automatic Speech Recognition (ASR) transcript carry over into the translation, by generating ASR-like corruptions based on phonetic confusions. Fine-tuning with PiDA on the FLEURS Vietnamese-English dataset improved translation accuracy on erroneous ASR outputs, with a notable gain in BLEU scores.
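
The general idea of corrupting clean transcripts with phonetically plausible substitutions can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `CONFUSIONS` table, the function names `corrupt` and `augment`, and the `rate` parameter are all assumptions made for this example, and the syllable pairs are hand-picked illustrations rather than data from the paper.

```python
import random

# Hypothetical confusion table (an assumption for illustration only):
# each syllable maps to syllables an ASR system might plausibly output
# instead, e.g. tone or initial-consonant confusions.
CONFUSIONS = {
    "ma": ["má", "mà"],
    "tranh": ["chanh"],
    "sông": ["xông"],
    "năm": ["nam"],
}


def corrupt(sentence, rate, rng):
    """Replace each confusable syllable with probability `rate`."""
    out = []
    for syllable in sentence.split():
        candidates = CONFUSIONS.get(syllable.lower())
        if candidates and rng.random() < rate:
            out.append(rng.choice(candidates))
        else:
            out.append(syllable)
    return " ".join(out)


def augment(pairs, rate=0.3, seed=0):
    """Keep every clean (source, target) pair and add a copy whose source
    side has been corrupted, paired with the unchanged target."""
    rng = random.Random(seed)
    augmented = []
    for src, tgt in pairs:
        augmented.append((src, tgt))
        noisy = corrupt(src, rate, rng)
        if noisy != src:  # skip copies that ended up identical
            augmented.append((noisy, tgt))
    return augmented
```

Calling `augment([("sông dài", "long river")], rate=1.0)` keeps the clean pair and adds a second pair whose source side contains the substituted syllable, still mapped to the original English target. A translation model fine-tuned on such pairs sees ASR-style errors alongside the correct translation.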

IMPACT Improves robustness of speech translation systems to ASR errors, potentially enhancing usability in noisy environments.

RANK_REASON The cluster contains a research paper detailing a new method for speech translation.


AI-generated summary · Google Gemini · from 2 sources.

COVERAGE [2]

  1. arXiv cs.CL TIER_1 English(EN) · Giang Son Nguyen, Tung X. Nguyen, Hieu Minh Truong, Nhu Vo, Wray Buntine, Dung D. Le

    PiDA: Phonetically-Informed Data Augmentation for Robust Vietnamese Speech Translation

    arXiv:2606.12911v1. Abstract: Cascaded speech translation (ST) systems suffer from error propagation when Automatic Speech Recognition (ASR) outputs incorrect transcripts. We present the first systematic categorization of ASR errors for Vietnamese ST, classifyin…

  2. arXiv cs.CL TIER_1 English(EN) · Dung D. Le

    PiDA: Phonetically-Informed Data Augmentation for Robust Vietnamese Speech Translation

    Cascaded speech translation (ST) systems suffer from error propagation when Automatic Speech Recognition (ASR) outputs incorrect transcripts. We present the first systematic categorization of ASR errors for Vietnamese ST, classifying substitution errors by phonetic cause and quan…