Researchers have developed a new concolic testing method for Transformer classifiers that uses SHAP estimates to prioritize path predicates based on their influence on the model's predictions. This approach, implemented in Python, makes self-attention semantics compatible with satisfiability modulo theories (SMT) solvers. Evaluations on CIFAR-10 with compact Transformer models, ResNet18, and VGG16 demonstrated a 60% success rate in finding adversarial examples within a one-pixel budget and a 900-second horizon, significantly outperforming a black-box differential evolution baseline.
AI IMPACT Enhances practical methods for finding adversarial examples in Transformer models, improving AI safety research.
RANK_REASON The cluster contains a research paper detailing a new method for testing neural network robustness.
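To make the prioritization idea in the summary concrete, here is a minimal sketch that ranks path predicates by the attribution mass of the input features they depend on. It is not the paper's implementation. It assumes a toy four-feature scoring function (`toy_score`), computes exact Shapley values by subset enumeration instead of calling a SHAP estimator, and uses hypothetical predicate names (`p_a`, `p_b`, `p_c`).

```python
from itertools import combinations
from math import factorial


def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x relative to baseline.

    Enumerates all feature subsets, so it is only feasible for tiny inputs.
    SHAP estimators approximate these values for real models.
    """
    n = len(x)

    def value(subset):
        # Features in the subset take their real value, the rest the baseline.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return f(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for combo in combinations(others, k):
                s = set(combo)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_score(z):
    # Hypothetical stand-in for a classifier logit over four "pixels".
    return 2.0 * z[0] - 1.0 * z[1] + 0.5 * z[2] * z[3]


def prioritize(predicates, phi):
    """Sort (name, feature_indices) pairs by total absolute attribution."""
    return sorted(predicates, key=lambda p: -sum(abs(phi[i]) for i in p[1]))


x = [1.0, 1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0, 0.0]
phi = shapley_values(toy_score, x, baseline)

# Hypothetical path predicates, each tagged with the features it reads.
predicates = [("p_a", [2, 3]), ("p_b", [0]), ("p_c", [1])]
ranked = prioritize(predicates, phi)
```

In a concolic loop, the predicates at the front of `ranked` would be the first candidates to negate and hand to the solver, so the search budget goes to the branches most likely to change the prediction.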
- arXiv
- Chih-Duo Hong
- CIFAR-10
- Hugging Face
- Python
- ResNet18
- satisfiability modulo theories
- SHAP
- Transformer
- VGG16
AI-generated summary · Google Gemini · from 1 source.