ArtNet: A JEPA-Like Articulatory Predictive Framework for Robust Zero-Shot Phoneme Recognition
Researchers have developed ArtNet, a framework for zero-shot phoneme recognition across languages. It uses articulatory features and a variational information bottleneck to learn acoustic-to-symbol mappings that are less sensitive to language-specific variation. In the reported experiments, ArtNet significantly reduces phoneme error rates compared to existing methods, particularly when combined with a vector-space inventory alignment strategy.
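For readers unfamiliar with the metric, phoneme error rate (PER) is conventionally computed as the edit distance between the predicted and reference phoneme sequences, divided by the reference length. The sketch below illustrates that standard definition only. It is not taken from the ArtNet work: the function names and the example phoneme sequences are hypothetical, and the summary does not say which scoring tool the authors used.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences.

    Counts the minimum number of substitutions, insertions, and
    deletions needed to turn `hyp` into `ref`.
    """
    # prev[j] holds the distance between the first i-1 reference
    # phonemes and the first j hypothesis phonemes.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference phoneme missed)
                curr[j - 1] + 1,     # insertion (extra hypothesis phoneme)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def phoneme_error_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """PER = edit distance / number of reference phonemes."""
    if not ref:
        raise ValueError("reference sequence must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical example: one substitution in a three-phoneme reference.
reference = ["k", "æ", "t"]
hypothesis = ["k", "a", "t"]
per = phoneme_error_rate(reference, hypothesis)
```

Because insertions count as errors, PER can exceed 1.0 when the hypothesis is much longer than the reference.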