Audio deepfake model explanations found to be fragile

By PulseAugur Editorial · [2 sources] · 2026-06-12 13:58

Researchers have shown that the explanations produced for audio deepfake detection models can be manipulated. By adding inaudible perturbations to an audio clip, an adversary can alter the model's attribution heatmaps without changing its prediction of whether the clip is a deepfake. The vulnerability was tested across several state-of-the-art architectures and points to a weakness in current post-hoc explanation methods for audio analysis.
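To make the idea concrete, here is a minimal sketch of an explanation-manipulation attack on a toy model. It is not the paper's method: the detector is a small random network, the attribution is a plain gradient saliency map, the perturbation budget is a simple max-amplitude bound standing in for the psychoacoustic constraint the abstract mentions, and the search is random rather than optimized. All names and sizes (`logit`, `saliency`, `manipulate`, `D`, `H`, `eps`) are hypothetical.

```python
import math
import random

random.seed(0)
D, H = 16, 8  # toy input length and hidden width (hypothetical sizes)
W = [[random.gauss(0, 1) / math.sqrt(D) for _ in range(D)] for _ in range(H)]
v = [random.gauss(0, 1) for _ in range(H)]

def hidden(x):
    """Hidden activations tanh(W x) of the toy detector."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W]

def logit(x):
    """Toy detector score: positive is read as 'fake', otherwise 'real'."""
    return sum(vj * hj for vj, hj in zip(v, hidden(x)))

def saliency(x):
    """Gradient of the logit with respect to each input sample."""
    g = [vj * (1.0 - hj * hj) for vj, hj in zip(v, hidden(x))]
    return [sum(g[j] * W[j][i] for j in range(H)) for i in range(D)]

def attr_distance(a, b):
    """L2 distance between two attribution vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def manipulate(x, eps=0.05, steps=300):
    """Random search for a bounded perturbation that shifts the saliency
    map as far as possible while leaving the predicted label unchanged."""
    base_label, base_attr = logit(x) > 0, saliency(x)
    best, best_dist = [0.0] * len(x), 0.0
    for _ in range(steps):
        # Propose a small step and clip it back into the [-eps, eps] budget.
        cand = [max(-eps, min(eps, d + random.uniform(-eps / 4, eps / 4)))
                for d in best]
        x_adv = [xi + ci for xi, ci in zip(x, cand)]
        if (logit(x_adv) > 0) != base_label:
            continue  # reject: the predicted label must not change
        dist = attr_distance(saliency(x_adv), base_attr)
        if dist > best_dist:
            best, best_dist = cand, dist
    return best, best_dist

x = [random.gauss(0, 1) for _ in range(D)]  # stand-in for an audio frame
delta, dist = manipulate(x)
x_adv = [xi + di for xi, di in zip(x, delta)]
print("label unchanged:", (logit(x_adv) > 0) == (logit(x) > 0))
print("max |delta|:", max(abs(d) for d in delta))
print("attribution shift (L2):", dist)
```

The sketch accepts a candidate only when the label is preserved, so the final perturbation changes the explanation but never the decision. A real attack would presumably replace the random search with gradient-based optimization and the amplitude bound with a perceptual constraint.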

IMPACT Reveals a vulnerability in AI model explanations, potentially impacting trust and security in audio deepfake detection systems.

RANK_REASON The cluster contains an academic paper detailing research findings on AI model explainability.


paper
safety

AI-generated summary · Google Gemini · from 2 sources.

COVERAGE [2]

arXiv cs.AI TIER_1 English(EN) · Piotr Kitłowski, Dominik Wiącek, Mateusz Modrzejewski · 2026-06-15 04:00

The Perceived Fragility of Explanations in Audio Models: Manipulation of Attribution with Unchanged Predictions

arXiv:2606.14466v1 · Abstract: This paper investigates the fragility of post-hoc explanation methods in audio deepfake detection. While previous work on explanation manipulation focused on images using standard $L_p$ metrics, we introduce a psychoacoustic frame…

arXiv cs.AI TIER_1 English(EN) · Mateusz Modrzejewski · 2026-06-12 13:58

The Perceived Fragility of Explanations in Audio Models: Manipulation of Attribution with Unchanged Predictions

This paper investigates the fragility of post-hoc explanation methods in audio deepfake detection. While previous work on explanation manipulation focused on images using standard $L_p$ metrics, we introduce a psychoacoustic framework that optimizes inaudible perturbations to dec…
