Researchers have developed a method to improve text-to-speech (TTS) quality for low-resource languages, testing it on Khmer and Korean. By fine-tuning the 2.4B-parameter VoxCPM2 model with a single Low-Rank Adaptation (LoRA) adapter, they raised the Mean Opinion Score (MOS) for Khmer from 3.85 to 4.23. The adapter trains only a small fraction of the model's parameters, which keeps fine-tuning inexpensive. The technique was most effective where the base model initially performed poorly: for Korean, which the base model already handled well, it brought no benefit and even degraded quality.
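To give a rough sense of why a LoRA adapter trains only a small fraction of a model's parameters, the arithmetic can be sketched as follows. Only the 2.4B total comes from the summary; the hidden size, rank, and number of adapted matrices are hypothetical placeholders, not VoxCPM2's actual configuration.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int, n_adapted: int) -> int:
    """Count parameters added by LoRA across n_adapted weight matrices.

    Each adapted (d_out x d_in) matrix gains two low-rank factors:
    A with shape (rank x d_in) and B with shape (d_out x rank).
    """
    return n_adapted * rank * (d_in + d_out)


TOTAL_PARAMS = 2_400_000_000  # 2.4B, as stated in the summary

# Hypothetical values for illustration only (not taken from the paper):
trainable = lora_trainable_params(d_in=2048, d_out=2048, rank=16, n_adapted=96)
fraction = trainable / TOTAL_PARAMS

print(f"trainable LoRA parameters: {trainable:,}")
print(f"fraction of the full model: {fraction:.4%}")
```

Under these assumed settings the adapter adds a few million parameters against a base of billions, which is the sense in which the fine-tuning is parameter-efficient.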
IMPACT This research demonstrates an efficient method for improving TTS quality in under-resourced languages, potentially broadening access to advanced speech synthesis technologies.
RANK_REASON The cluster contains an academic paper detailing a new research methodology for improving TTS models.
AI-generated summary · Google Gemini · from 2 sources.