Brief · PulseAugur

COMMENTARY · r/MachineLearning English(EN) · 7h

Best architecture for seamless Bilingual TTS? (Azure / English + Korean) [D]

A user is seeking the best architecture for a bilingual Text-to-Speech system that mixes English and Korean within a single sentence. They report two problems with Azure Cognitive Services: a single multilingual voice pronounces Korean with an unnatural accent, while switching between separate English and Korean voices introduces disruptive pauses. The user is exploring SSML workarounds, alternative Azure OpenAI voices, or entirely different solutions to get native-sounding pronunciation in both languages for their language learning application.
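
One of the SSML workarounds mentioned above can be sketched as follows: keep a single multilingual voice for the whole sentence (so there is no voice switch and no pause) and wrap each Korean run in a `<lang xml:lang="ko-KR">` element. This is a minimal sketch, not the poster's code. The function name `build_bilingual_ssml`, the default voice `en-US-JennyMultilingualNeural`, and the Hangul-range heuristic are assumptions chosen for illustration, and whether the language tag actually removes the accent depends on the voice and would need to be checked by ear.

```python
import re
from xml.sax.saxutils import escape

# Hangul Jamo, Hangul Compatibility Jamo, and Hangul Syllables.
_HANGUL = r"[\u1100-\u11FF\u3130-\u318F\uAC00-\uD7A3]+"
# A Korean run is one or more Hangul words separated only by whitespace.
_KOREAN_RUN = re.compile(rf"{_HANGUL}(?:\s+{_HANGUL})*")


def build_bilingual_ssml(text, voice="en-US-JennyMultilingualNeural"):
    """Wrap Korean runs in <lang> tags inside a single <voice> element.

    `voice` is an assumed default; any multilingual voice name could be used.
    """
    parts = []
    pos = 0
    for match in _KOREAN_RUN.finditer(text):
        # English (or other non-Hangul) text before this Korean run.
        parts.append(escape(text[pos:match.start()]))
        # Korean run, explicitly tagged so the voice uses Korean phonetics.
        parts.append(f'<lang xml:lang="ko-KR">{escape(match.group())}</lang>')
        pos = match.end()
    parts.append(escape(text[pos:]))  # trailing non-Korean text

    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">'
        f'<voice name="{voice}">{"".join(parts)}</voice>'
        "</speak>"
    )


if __name__ == "__main__":
    print(build_bilingual_ssml("The word 사과 means apple."))
```

The resulting string would be passed to whatever SSML synthesis call the application already uses. The script-detection regex only catches Hangul, so Korean written in Latin letters, digits, or punctuation between Korean words would stay in the English segment.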

IMPACT Developers building bilingual text-to-speech can see the trade-off described here: a single multilingual voice yields accented Korean, while per-language voice switching adds audible pauses.

Python
English
Azure OpenAI
Korean
React Native
SSML
Azure Cognitive Services