Researchers have developed a novel supply-chain attack called Sparse Backdoor, capable of embedding a provably undetectable backdoor into pre-trained image classifiers such as convolutional networks and Vision Transformers. The attack injects a sparse perturbation into the fully connected layers and then masks it with a Gaussian dither; the dither creates a clean reference distribution, making it computationally infeasible to distinguish the backdoored model from the original, even with white-box access to the parameters.
Summary written by gemini-2.5-flash-lite from 1 source.
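The summary describes two steps: a sparse edit to fully connected weights, followed by a Gaussian dither over all weights that statistically hides the edit. The minimal NumPy sketch below illustrates that idea in the abstract only; the layer shape, sparsity fraction, and noise scales are illustrative assumptions, not values or code from the paper.

```python
import numpy as np

# Illustrative sketch of the mechanism described above. All parameter
# choices (layer shape, sparsity, noise scales) are assumptions made
# for demonstration, not details taken from the Sparse Backdoor paper.

rng = np.random.default_rng(0)

# Stand-in for a pre-trained fully connected layer's weight matrix.
W = rng.normal(0.0, 0.02, size=(512, 10))

# Step 1: sparse perturbation -- modify only a tiny, randomly chosen
# subset of weights (here, a random mask standing in for the entries
# that would encode the backdoor behaviour).
sparsity = 0.001                                  # fraction of entries perturbed
mask = rng.random(W.shape) < sparsity
perturbation = rng.normal(0.0, 0.05, size=W.shape) * mask
W_backdoored = W + perturbation

# Step 2: Gaussian dither -- add small i.i.d. noise to *every* weight,
# so the sparsely perturbed entries are buried inside an otherwise
# clean-looking Gaussian reference distribution.
dither_sigma = 0.01
dither = rng.normal(0.0, dither_sigma, size=W.shape)
W_released = W_backdoored + dither

print("entries touched by sparse perturbation:", int(mask.sum()))
print("max |released - original| weight change:", float(np.abs(W_released - W).max()))
```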
IMPACT Highlights a sophisticated new attack vector in the model supply chain, necessitating enhanced security measures for deployed AI systems.
RANK_REASON Academic paper detailing a new method for embedding undetectable backdoors in image classification models.