ICML 2026 | University of Electronic Science and Technology of China: Tree-based Self-Play (TSP), a Fine-Grained Self-Correction Framework for Secure Code Large Language Models
Researchers have developed a novel framework called Tree Self-Play (TSP) to address the inherent security vulnerabilities in large language models trained on code. Current methods like supervised fine-tuning and reinforcement learning are too coarse-grained to fix localized coding errors that lead to issues such as SQL injection. TSP introduces a fine-grained, self-driven approach that precisely identifies risk nodes in code and uses self-play to generate both safe and vulnerable code paths for targeted optimization.
AI IMPACT: This framework could significantly improve the security of AI-generated code, reducing vulnerabilities and enhancing trust in AI-assisted software development.
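To make the notion of a localized error concrete, here is a minimal Python sketch of an SQL injection flaw and its fix. It is not taken from the TSP paper: the function names, the `users` table, and the use of the standard-library `sqlite3` module are all assumptions chosen for illustration. The two lookup functions differ at a single point, namely how the query is built, which is the kind of small "risk node" where a safe path and a vulnerable path can diverge.

```python
import sqlite3


def make_db():
    """Build a throwaway in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a1"), ("bob", "b2")],
    )
    return conn


def find_user_vulnerable(conn, name):
    # Vulnerable path: user input is spliced directly into the SQL string,
    # so quote characters in `name` can change the structure of the query.
    query = f"SELECT name, secret FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Safe path: the same query with a bound parameter, so `name` is
    # always treated as data and never as SQL syntax.
    query = "SELECT name, secret FROM users WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()


if __name__ == "__main__":
    conn = make_db()
    payload = "' OR '1'='1"
    print(len(find_user_vulnerable(conn, payload)))  # every row leaks
    print(len(find_user_safe(conn, payload)))        # no row matches
```

The two functions are otherwise identical, which is why coarse, whole-sample training signals may struggle to distinguish them, while a method that targets the differing statement could in principle do so.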