Certified Robustness to Data Poisoning in Gradient-Based Training
Researchers have developed a novel framework to certify model robustness against data poisoning attacks without altering the training algorithm or model architecture. This method uses convex relaxations to estimate the range of possible parameter updates, thereby bounding the worst-case behavior of models trained on potentially manipulated data. The approach provides guarantees against untargeted, targeted, and backdoor attacks, demonstrating effectiveness across diverse real-world datasets.
AI IMPACT Provides a method to secure ML models against data manipulation, crucial for applications in sensitive domains like healthcare and autonomous driving.
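
To make the idea of bounding parameter updates concrete, here is a minimal sketch that is not taken from the paper. It uses plain interval arithmetic (a very coarse convex relaxation) on a toy one-parameter linear model with squared loss. The threat model is an assumption chosen for illustration: an adversary may shift the labels of at most `k` training points by at most `eps` each. All function names and parameters (`grad_interval`, `certified_step`, `certify`, `train`) are hypothetical.

```python
def grad_interval(w_lo, w_hi, x, y_lo, y_hi):
    """Interval containing the gradient (w*x - y)*x of 0.5*(w*x - y)**2
    for every w in [w_lo, w_hi] and every y in [y_lo, y_hi]."""
    # g = w*x^2 - y*x; x^2 >= 0, so the first term is monotone in w.
    yx_lo, yx_hi = sorted((y_lo * x, y_hi * x))
    return w_lo * x * x - yx_hi, w_hi * x * x - yx_lo


def certified_step(w_lo, w_hi, xs, ys, lr, eps, k):
    """One full-batch gradient step mapped over a parameter interval.
    Assumed threat model: at most k labels are shifted by at most eps."""
    n = len(xs)
    clean = [grad_interval(w_lo, w_hi, x, y, y) for x, y in zip(xs, ys)]
    pois = [grad_interval(w_lo, w_hi, x, y - eps, y + eps) for x, y in zip(xs, ys)]
    # The adversary gains the most by poisoning the k samples whose gradient
    # bounds move furthest, so add the k largest shifts in each direction.
    drop = sorted(p[0] - c[0] for p, c in zip(pois, clean))[:k]
    rise = sorted((p[1] - c[1] for p, c in zip(pois, clean)), reverse=True)[:k]
    g_lo = (sum(c[0] for c in clean) + sum(drop)) / n
    g_hi = (sum(c[1] for c in clean) + sum(rise)) / n
    # w_new = w - lr * g, so the lower bound pairs w_lo with g_hi.
    return w_lo - lr * g_hi, w_hi - lr * g_lo


def certify(xs, ys, w0=0.0, lr=0.1, steps=10, eps=0.5, k=1):
    """Interval containing the trained weight for every allowed poisoning."""
    w_lo = w_hi = w0
    for _ in range(steps):
        w_lo, w_hi = certified_step(w_lo, w_hi, xs, ys, lr, eps, k)
    return w_lo, w_hi


def train(xs, ys, w0=0.0, lr=0.1, steps=10):
    """Ordinary full-batch gradient descent, used only for comparison."""
    w = w0
    for _ in range(steps):
        w -= lr * sum((w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    return w


if __name__ == "__main__":
    xs = [1.0, 2.0, -1.0, 0.5]
    ys = [2.0, 4.1, -1.9, 1.0]
    print("certified weight interval:", certify(xs, ys))
    print("clean trained weight:", train(xs, ys))
```

If every weight in the returned interval yields the same prediction on a test input, that prediction is certified under the assumed threat model. The sketch is sound but loose: it lets the adversary pick a different set of `k` points at each step and ignores the dependence between the current weight and its gradient, so the interval widens with every step. Tighter relaxations are what make certificates of this kind useful in practice.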