Brief · PulseAugur

RESEARCH · arXiv cs.CL · English (EN)

KIT's Submission to Cross-Lingual Voice Cloning in IWSLT 2026

Researchers from KIT present their system for the cross-lingual voice cloning task at IWSLT 2026, a capability important for speech translation. The system builds on the FishAudio-S2-Pro multilingual text-to-speech model and adds language tag prompting to strengthen language control and reduce accent bleed-through. The team also fine-tuned the model with reinforcement learning and introduced a reference-conditioned lexical matching technique to improve the pronunciation of specialized vocabulary.

IMPACT This research advances cross-lingual voice cloning and could make translated speech more natural and intelligible in multilingual communication systems.

KIT
IWSLT 2026
FishAudio-S2-Pro