New TTS framework synthesizes Tibetan dialects from limited data

By PulseAugur Editorial · [1 source] · 2026-04-27 04:00

Researchers have developed FMSD-TTS, a few-shot text-to-speech system that generates speech for Tibetan, a low-resource language, across its three main dialects: Ü-Tsang, Amdo, and Kham. The system uses a speaker-dialect fusion module and a Dialect-Specialized Dynamic Routing Network to capture dialectal variation while preserving speaker identity. According to the paper's evaluations, FMSD-TTS outperforms existing methods in dialectal expressiveness and speaker similarity, and the synthesized speech is further validated on a speech-to-speech dialect conversion task.

IMPACT Enables creation of synthetic speech for low-resource languages, potentially aiding in dialect preservation and accessibility.

RANK_REASON This is a research paper describing a new text-to-speech system for a low-resource language.


AI-generated summary · Google Gemini · from 1 source.


COVERAGE [1]

arXiv cs.CL TIER_1 English(EN) · Yutong Liu, Ziyue Zhang, Ban Ma-bao, Yuqing Cai, Yongbin Yu, Renzeng Duojie, Xiangxiang Wang, Fan Gao, Cheng Huang, Nyima Tashi · 2026-04-27 04:00

FMSD-TTS: Few-shot Multi-Speaker Multi-Dialect Text-to-Speech Synthesis for Ü-Tsang, Amdo and Kham Speech Dataset Generation

arXiv:2505.14351v4 Announce Type: replace-cross Abstract: Tibetan is a low-resource language with minimal parallel speech corpora spanning its three major dialects (Ü-Tsang, Amdo, and Kham), limiting progress in speech modeling. To address this issue, we propose FMSD-TTS, a few-sh…

