Evaluating and Preserving Lexical Stress in English-to-Chinese Speech-to-Speech Translation
Researchers propose a method for evaluating and preserving lexical stress in English-to-Chinese speech-to-speech translation (S2ST). They built a stress-annotated Chinese dataset and a Mandarin stress detector based on XLS-R, then combined the detector with the English EmphAssess system to define an objective metric for cross-lingual stress evaluation. A fine-tuned CosyVoice3 system transferred stress more reliably while maintaining translation quality, and the new metric correlated strongly with human judgment.
AI IMPACT: This research could improve the naturalness and expressiveness of speech-to-speech translation systems.
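
To make the idea of a cross-lingual stress metric concrete, here is a minimal sketch. It is not the metric from the research, whose exact definition the summary does not give. It assumes that a stress detector has already marked stressed word positions in the English source and the Mandarin output, and that a word alignment between the two is available. The function name `stress_transfer_scores`, the alignment format, and the use of precision, recall, and F1 are all illustrative assumptions.

```python
from typing import Dict, Set, Tuple


def stress_transfer_scores(
    src_stressed: Set[int],
    tgt_stressed: Set[int],
    alignment: Dict[int, Set[int]],
) -> Tuple[float, float, float]:
    """Hypothetical stress-transfer score (precision, recall, F1).

    src_stressed: indices of stressed words in the source sentence.
    tgt_stressed: indices of stressed words detected in the translation.
    alignment:    maps each source word index to its aligned target indices.
    """
    # Project every stressed source word onto the target side.
    expected: Set[int] = set()
    for src_index in src_stressed:
        expected |= alignment.get(src_index, set())

    # Target words that are stressed and were expected to be stressed.
    hits = expected & tgt_stressed

    precision = len(hits) / len(tgt_stressed) if tgt_stressed else 0.0
    recall = len(hits) / len(expected) if expected else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Illustrative example (invented for this sketch):
# source "I want the RED car"  -> word 3 ("red") is stressed
# target "我 想要 红色 的 车"    -> word 2 ("红色") aligns to "red"
example_alignment = {0: {0}, 1: {1}, 3: {2}, 4: {4}}
preserved = stress_transfer_scores({3}, {2}, example_alignment)
misplaced = stress_transfer_scores({3}, {4}, example_alignment)
```

In this toy case, `preserved` scores perfectly because the stress lands on the aligned word, while `misplaced` scores zero because the stress moved to an unrelated word. A real evaluation would also have to handle alignment errors and detector noise, which this sketch ignores.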