BackWeak: Backdooring Knowledge Distillation Simply with Weak Triggers and Fine-tuning
Researchers have developed BackWeak, a method for implanting backdoors through knowledge distillation. It fine-tunes a teacher model using subtle, imperceptible ("weak") triggers. The backdoor then transfers reliably to a variety of student architectures during standard distillation, achieving high success rates with greater stealth than previous methods.
AI IMPACT Highlights a new vulnerability in AI model compression that could affect the security of deployed AI systems.
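For readers unfamiliar with the "standard distillation" step mentioned above, the following is a minimal sketch of the generic temperature-scaled distillation loss. It is not BackWeak's own procedure. The function names and the default temperature of 4.0 are illustrative assumptions, not details from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-softened softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened outputs, scaled by T^2.

    Generic distillation objective; the temperature value is an
    assumed default for illustration only.
    """
    p = softmax(teacher_logits, temperature)  # teacher soft targets
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


# Hypothetical logits for a 3-class problem.
teacher = [2.0, 0.5, -1.0]
student = [1.0, 1.0, 0.0]
loss = distillation_loss(student, teacher)
```

Because the student is trained to match the teacher's output distribution on every input it is shown, any behavior fine-tuned into the teacher can in principle be carried across by this objective, which is the channel the summary above describes.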