The Platonic Defense: Backdoor Defense for Self-Supervised Encoders in the Era of Large Scale Pre-training

New Defense Counters Backdoor Attacks on Self-Supervised AI Models

By the PulseAugur editorial team · [1 source] · 2026-06-30 04:00

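Researchers have introduced a new defense mechanism, Platonic Representation Defense, against backdoor attacks on self-supervised learning (SSL) models. The method operates in a black-box setting, meaning it requires no access to labels, attack patterns, or training data. The defense is inspired by the Platonic Representation Hypothesis, which holds that independently trained encoders can form compatible projections of reality. By formalizing this idea as a conditional energy function, the system can detect and purify representations, and it is reported to deliver significant performance gains against a variety of attacks.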
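The summary does not give the paper's actual energy function, so the following is only a toy sketch of the general idea, not the authors' method. It assumes a trusted, independently trained reference encoder and a small set of unlabeled anchor inputs (both assumptions, as are all function names, the trigger behavior, and the threshold). The energy here is the mean squared disagreement between the two encoders' cosine-similarity profiles over the anchors; the sketch covers detection only, not purification.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def similarity_profile(embedding, anchor_embeddings):
    """Similarity of one embedding to each anchor embedding."""
    return [cosine(embedding, a) for a in anchor_embeddings]

def conditional_energy(z_suspect, z_reference, anchors_suspect, anchors_reference):
    """Hypothetical energy: mean squared gap between the two similarity profiles.

    Low energy means both encoders place the input consistently relative
    to the anchors; high energy means they disagree.
    """
    p = similarity_profile(z_suspect, anchors_suspect)
    q = similarity_profile(z_reference, anchors_reference)
    return sum((a - b) ** 2 for a, b in zip(p, q)) / len(p)

# --- Toy encoders (illustrative stand-ins, not real models) ---
TARGET = [1.0, 0.0, 0.0, 0.0]  # embedding the toy backdoor forces

def reference_encoder(x):
    # Trusted encoder: identity map.
    return list(x)

def suspect_encoder(x, triggered=False):
    # Clean inputs: coordinate reversal (an orthogonal map, so cosine
    # similarities match the reference encoder's).
    # Triggered inputs: collapse to the attacker's target embedding.
    return list(TARGET) if triggered else list(reversed(x))

# Unlabeled anchor inputs (assumed available): the standard basis vectors.
anchors = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
anchors_suspect = [suspect_encoder(a) for a in anchors]
anchors_reference = [reference_encoder(a) for a in anchors]

x = [1.0, 2.0, 3.0, 4.0]
clean_energy = conditional_energy(
    suspect_encoder(x), reference_encoder(x), anchors_suspect, anchors_reference
)
triggered_energy = conditional_energy(
    suspect_encoder(x, triggered=True), reference_encoder(x),
    anchors_suspect, anchors_reference,
)

THRESHOLD = 0.05  # hypothetical cut-off chosen for this toy example
is_flagged = triggered_energy > THRESHOLD
```

In this toy setup the clean input yields near-zero energy because the two encoders differ only by an orthogonal map, while the triggered input breaks that agreement and crosses the threshold. A real system would need to calibrate the threshold and handle encoders whose similarity structures agree only approximately.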

Impact: This defense mechanism could strengthen the security of widely used self-supervised models against malicious manipulation.

Ranking rationale: This cluster contains an academic paper detailing a new AI security technique.


AI-generated summary · Google Gemini · from 1 source.

Sources [1]

arXiv cs.CV · Tuo Chen, Minjing Dong, Benlei Cui, Jian Liu, Jie Gui · 2026-06-30 04:00

The Platonic Defense: Backdoor Defense for Self-Supervised Encoders in the Era of Large Scale Pre-training

arXiv:2606.29451v1 Announce Type: new Abstract: Self-supervised learning (SSL) pretrained models have become a dominant paradigm for visual representation learning, but they are vulnerable to backdoor attacks. Existing defenses struggle to defend against such attacks in a fully b…
