Are Classification Robustness and Explanation Robustness Really Strongly Correlated? An Analysis Through Input Loss Landscape
This paper investigates the relationship between classification robustness and explanation robustness in image classification models. The authors propose a new training method and a clustering-based evaluation approach for analyzing explanation robustness. Their findings suggest that improving explanation robustness does not necessarily improve classification robustness, challenging a common assumption in the field.
AI IMPACT: Challenges assumptions about AI model robustness, potentially guiding future research into more reliable AI systems.