Brief · PulseAugur

TOOL · arXiv cs.AI · 6h

A Study of the Scale Invariant Signal to Distortion Ratio in Speech Separation with Noisy References

Researchers investigated how well the Scale-Invariant Signal-to-Distortion Ratio (SI-SDR) works in speech separation when the training references themselves contain noise. Their analysis shows that noise in the references limits the achievable SI-SDR and introduces unwanted noise into the separated outputs. To mitigate this, they propose enhancing the references and augmenting the training data; this reduced the noise in the outputs but can also introduce processing artifacts.
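To make the first finding concrete, here is a minimal sketch using the standard SI-SDR definition (project the estimate onto the reference, then compare the projected energy to the residual energy). It is an illustration under stated assumptions, not the paper's own derivation or code: the signals are synthetic Gaussian stand-ins for speech, the noise is forced to be exactly orthogonal to the clean signal, and `make_noisy_reference` is a hypothetical helper introduced only for this example.

```python
import numpy as np


def si_sdr(estimate, reference, eps=1e-12):
    """SI-SDR in dB of `estimate` measured against `reference` (1-D arrays)."""
    # Remove the mean so a DC offset does not affect the projection.
    estimate = estimate - np.mean(estimate)
    reference = reference - np.mean(reference)
    # Optimal scaling of the reference; this is what makes the metric scale-invariant.
    scale = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = scale * reference
    distortion = estimate - target
    return 10.0 * np.log10(
        (np.dot(target, target) + eps) / (np.dot(distortion, distortion) + eps)
    )


def make_noisy_reference(clean, snr_db, rng):
    """Hypothetical helper: add noise orthogonal to a zero-mean `clean` at `snr_db`."""
    noise = rng.standard_normal(clean.shape)
    noise -= np.mean(noise)
    # Remove the component along `clean` so the two are exactly orthogonal.
    noise -= np.dot(noise, clean) / np.dot(clean, clean) * clean
    # Rescale the noise to hit the requested signal-to-noise ratio.
    noise *= np.sqrt(np.dot(clean, clean) / (np.dot(noise, noise) * 10 ** (snr_db / 10)))
    return clean + noise


rng = np.random.default_rng(0)
clean = rng.standard_normal(16000)  # stand-in for one second of audio at 16 kHz
clean -= np.mean(clean)
reference = make_noisy_reference(clean, snr_db=10.0, rng=rng)

score = si_sdr(clean, reference)            # a perfectly clean estimate
rescaled = si_sdr(3.0 * clean, reference)   # same estimate with a different gain
print(f"{score:.2f} dB, {rescaled:.2f} dB")
```

Under these assumptions both scores equal the reference's own signal-to-noise ratio of 10 dB: even a perfectly clean estimate cannot score higher against this reference, and changing the estimate's gain makes no difference. Real speech and real noise are only approximately orthogonal, so in practice the ceiling is approximate rather than exact.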

IMPACT Highlights limitations in current speech separation metrics, potentially guiding future research in audio AI.

SI-SDR
WSJ0-2Mix
WHAM!
NISQA.v2
Libri2Mix
Simon Dahl Jepsen
Scale-Invariant Signal-to-Distortion Ratio