Can Subgraph Explanations Be Weaponized to Steal Graph Neural Networks?
Researchers have developed a method for extracting information from graph neural networks (GNNs) by exploiting their explainability interfaces. The attack operates under strict black-box constraints, using explanation outputs to estimate edge sensitivity and to search decision boundaries efficiently. In experiments it outperforms existing baselines, pointing to security vulnerabilities in graph-machine-learning-as-a-service (GMLaaS) platforms and informing defensive strategies and AI policy.
AI IMPACT: Highlights security risks in explainable AI for graph models, potentially influencing future AI safety research and regulatory approaches.