BadWorld: Adversarial Attacks on World Models
Researchers have developed BadWorld, an adversarial framework designed to expose vulnerabilities in visual world models (VWMs). The label-free method adds subtle perturbations to input images that cause catastrophic failures in the model's future predictions, even under unseen user controls. The findings highlight significant risks for deploying VWMs in safety-critical applications and suggest potential privacy protection mechanisms.
IMPACT: Highlights critical risks for deploying visual world models in safety-critical systems.
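
To make the idea of a label-free perturbation concrete, here is a minimal sketch. It is not BadWorld's actual method, which this summary does not describe. It assumes a toy one-step "world model" (a `tanh` layer with hypothetical weights `W` and `U`) and swaps in a generic projected sign-gradient attack: the perturbation is pushed to maximize how far the prediction moves from the clean prediction, so no ground-truth labels are needed. All names, sizes, and hyperparameters are illustrative assumptions.

```python
import numpy as np

def world_model(W, U, x, a):
    """Toy stand-in for a VWM: predicts next-step features from image x and control a."""
    return np.tanh(W @ x + U @ a)

def mean_deviation(W, U, x, delta, actions):
    """Mean squared change in the prediction caused by delta, averaged over controls."""
    return float(np.mean([
        np.sum((world_model(W, U, x + delta, a) - world_model(W, U, x, a)) ** 2)
        for a in actions
    ]))

def label_free_attack(W, U, x, actions, eps=0.03, step=0.005, iters=50, seed=0):
    """Sign-gradient ascent on prediction deviation, kept inside an L-infinity ball.

    Simplification: x + delta is not clipped to a valid pixel range.
    """
    rng = np.random.default_rng(seed)
    # Random start: at delta = 0 the gradient of the deviation is exactly zero.
    delta = rng.uniform(-eps, eps, size=x.shape)
    best, best_loss = delta.copy(), mean_deviation(W, U, x, delta, actions)
    for _ in range(iters):
        grad = np.zeros_like(x)
        for a in actions:
            clean = world_model(W, U, x, a)
            adv = world_model(W, U, x + delta, a)
            # Analytic gradient of ||adv - clean||^2 w.r.t. delta (tanh' = 1 - tanh^2).
            grad += W.T @ (2.0 * (adv - clean) * (1.0 - adv ** 2))
        # Ascent step, then projection back onto the eps-ball.
        delta = np.clip(delta + step * np.sign(grad), -eps, eps)
        loss = mean_deviation(W, U, x, delta, actions)
        if loss > best_loss:
            best, best_loss = delta.copy(), loss
    return best

# Hypothetical setup: 16-dim "image", 8-dim prediction, 2-dim control signal.
rng = np.random.default_rng(1)
W = rng.normal(size=(8, 16))
U = rng.normal(size=(8, 2))
x = rng.uniform(0.0, 1.0, size=16)
train_actions = rng.normal(size=(4, 2))   # controls used while crafting delta
unseen_actions = rng.normal(size=(4, 2))  # controls held out from the attack

delta = label_free_attack(W, U, x, train_actions)
print("deviation on unseen controls:", mean_deviation(W, U, x, delta, unseen_actions))
```

The perturbation is crafted on one set of controls and evaluated on another, which mirrors the "unseen user controls" setting in spirit only. A real VWM would be a deep generative model, and the gradient would come from automatic differentiation rather than a hand-written formula.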