The Distillation Game: Adaptive Attacks & Efficient Defenses
Researchers have developed a new framework called "The Distillation Game" to study the trade-off between model utility and imitation risk. This framework models the interaction as a minimax game between a teacher model and an adaptive student model. The study introduces an adaptive evaluation rule and a defense template, leading to a Product-of-Experts (PoE) defense that combines the teacher with a proxy student.
AI IMPACT: This research highlights that strong distillation attacks remain a significant challenge, suggesting that defenses should be evaluated against adaptive student models rather than passive ones.
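
To make the Product-of-Experts idea concrete, here is a minimal sketch of how a teacher's next-token distribution could be combined with a proxy student's. The summary does not give the exact form of the defense, so this uses a generic weighted product of experts, q(y) ∝ p_teacher(y) · p_proxy(y)^beta. The function names and the weight `beta` (including its sign and size) are assumptions for illustration, not the paper's method.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def poe_distribution(teacher_logits, proxy_logits, beta):
    """Generic product of experts over one token position.

    A product of distributions is a sum in log space, so
    q(y) is proportional to p_teacher(y) * p_proxy(y) ** beta.
    Using logits instead of log-probabilities changes the result only
    by a constant that cancels in the softmax.

    `beta` is a hypothetical weight on the proxy-student expert:
    beta = 0 returns the teacher unchanged, and a negative beta
    down-weights tokens the proxy student already favors.
    """
    if len(teacher_logits) != len(proxy_logits):
        raise ValueError("teacher and proxy must share a vocabulary")
    combined = [t + beta * s for t, s in zip(teacher_logits, proxy_logits)]
    return softmax(combined)


if __name__ == "__main__":
    teacher = [2.0, 1.0, 0.0]  # toy 3-token vocabulary
    proxy = [2.0, 0.0, 0.0]    # proxy student strongly prefers token 0
    for beta in (0.0, -0.5):
        print(beta, poe_distribution(teacher, proxy, beta))
```

In this toy setup, a negative `beta` shifts probability mass away from the token the proxy student already predicts confidently. The cost is that the served distribution drifts from the teacher's own, which is the utility versus imitation-risk trade-off the framework is meant to study.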