New attack targets ASR by manipulating speech features, not waveforms

By PulseAugur Editorial · 1 source · 2026-06-06 04:00

Researchers have developed an adversarial attack on automatic speech recognition (ASR) systems that operates in feature space rather than directly on audio waveforms. The method, termed the Clean-Referenced Feature-Vocoder Attack, aims to improve transferability to black-box ASR models and to bypass defenses that target waveform perturbations. It manipulates self-supervised learning (SSL) representations and reconstructs audio from them with a vocoder. The attack significantly increased Word Error Rate (WER) on various ASR models, highlighting a vulnerability that current robustness evaluations miss.
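For readers unfamiliar with the metric, WER is conventionally the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the ASR output, divided by the number of reference words. The sketch below is a minimal illustration of that standard definition, not the paper's evaluation code; the function name `word_error_rate` and the example transcripts are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcripts, made up for illustration only.
reference = "turn on the kitchen lights"
clean_output = "turn on the kitchen lights"
attacked_output = "turn off the chicken flights"

clean_wer = word_error_rate(reference, clean_output)
attacked_wer = word_error_rate(reference, attacked_output)
print(f"clean WER: {clean_wer:.2f}, attacked WER: {attacked_wer:.2f}")
```

An attack of the kind described would be judged by how far it pushes this number up relative to the clean audio. Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.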

IMPACT This research reveals a new vulnerability in ASR systems, potentially impacting the security and reliability of speech-to-text technologies.

RANK_REASON The cluster contains an academic paper detailing a new method for adversarial attacks on ASR systems.



AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

arXiv cs.AI · English (EN) · Yifan Liao, Zongmin Zhang, Zhen Sun, Yuhui Sun, Xinhu Zheng, Xinlei He · 2026-06-06 04:00

Beyond Waveform Robustness: Robust Feature-Vocoder Adversarial Attacks on Automatic Speech Recognition

arXiv:2606.05678v1 Announce Type: cross Abstract: Automatic speech recognition (ASR) systems have become widely used for multilingual speech-to-text transcription. Their robustness to adversarial attacks has become an important topic for the community. Existing adversarial attack…

