LASE model improves cross-script voice cloning by making embeddings language-uninformative

By PulseAugur Editorial · [2 sources] · 2026-05-01 16:46

A new arXiv preprint introduces LASE (Language-Adversarial Speaker Encoding), a speaker encoder aimed at multilingual voice cloning. Off-the-shelf encoders do not treat the same speaker identically across scripts, and the failure depends on accent: it shows up particularly when non-Indic voices are projected into Indic languages. LASE is trained with a supervised contrastive loss plus a gradient-reversal cross-entropy objective, so that its embeddings stay informative about the speaker while carrying as little information about the language as possible. According to the paper, this narrows the cross-script identity gap and improves cross-script speaker recall while using substantially less training data.
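To make the two objectives named in the summary concrete, here is a minimal, framework-free sketch. It is an illustration under stated assumptions, not the paper's implementation: the function and class names, the temperature of 0.1, and the reversal strength `lam` are all hypothetical, and the supervised contrastive loss is written in its common textbook form. The gradient-reversal class only mirrors what an autograd framework would do: identity on the forward pass, negated and scaled gradient on the backward pass.

```python
import math


def log_sum_exp(values):
    # Numerically stable log(sum(exp(v))) for a non-empty list.
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def l2_normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def supervised_contrastive_loss(embeddings, speaker_ids, temperature=0.1):
    """Pulls same-speaker clips together, regardless of language.

    Assumed form: for each anchor, average the log-probability of its
    positives (other clips by the same speaker) against all other clips.
    """
    z = [l2_normalize(e) for e in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n)
                     if j != i and speaker_ids[j] == speaker_ids[i]]
        if not positives:
            continue  # an anchor with no same-speaker clip contributes nothing
        sims = {j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
                for j in range(n) if j != i}
        log_denom = log_sum_exp(list(sims.values()))
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


def cross_entropy(logits, target):
    # Loss of a (hypothetical) language classifier for one clip.
    return log_sum_exp(logits) - logits[target]


class GradientReversal:
    """Identity forward; flips and scales the gradient on the way back.

    The language classifier minimizes cross-entropy, while the encoder
    upstream of this layer receives the opposite gradient and therefore
    learns embeddings that make the language harder to predict.
    """

    def __init__(self, lam=1.0):
        self.lam = lam  # assumed reversal strength

    def forward(self, x):
        return list(x)

    def backward(self, grad_output):
        return [-self.lam * g for g in grad_output]
```

In a real training loop the encoder would receive the gradient of the contrastive loss directly and the gradient of the language cross-entropy through the reversal layer; how the two terms are weighted is not stated in the summary and is left open here.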

IMPACT Improves cross-script voice cloning accuracy, potentially enabling more seamless multilingual TTS systems.

RANK_REASON The cluster contains an arXiv preprint detailing a new method for speaker encoding in multilingual voice cloning.



AI-generated summary · Google Gemini · from 2 sources.


COVERAGE [2]

arXiv cs.CL TIER_1 English(EN) · Venkata Pushpak Teja Menta · 2026-05-04 04:00

LASE: Language-Adversarial Speaker Encoding for Indic Cross-Script Identity Preservation

arXiv:2605.00777v1 (announce type: cross). Abstract: A speaker encoder used in multilingual voice cloning should treat the same speaker identically regardless of which script the audio was uttered in. Off-the-shelf encoders do not, and the failure is accent-conditional. On a 1043-pa…

arXiv cs.CL TIER_1 English(EN) · Venkata Pushpak Teja Menta · 2026-05-01 16:46

LASE: Language-Adversarial Speaker Encoding for Indic Cross-Script Identity Preservation

A speaker encoder used in multilingual voice cloning should treat the same speaker identically regardless of which script the audio was uttered in. Off-the-shelf encoders do not, and the failure is accent-conditional. On a 1043-pair Western-accented voice corpus across English, H…

