Trusted Weights, Treacherous Optimizations? Optimization-Triggered Backdoor Attacks on LLMs

New LLM vulnerabilities found in compilation and trigger strength

By the PulseAugur editorial team · [5 sources] · 2026-05-20 02:55

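Researchers have identified new vulnerabilities in large language models (LLMs) tied to the optimization techniques used during deployment. One study shows that compilation, a step intended to improve efficiency, can be exploited to implant hidden backdoors that fire only under specific compilation conditions, bypass standard safety checks, and achieve high attack success rates on open-source LLMs. A separate theoretical paper finds that, counterintuitively, stronger triggers in backdoor attacks can sometimes help the defender in high-dimensional settings, with the attack success rate peaking at a finite trigger strength.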
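The compilation paper's abstract attributes the attack surface to the "numerical side effects" of compilation. As a rough illustration of what such a side effect can look like, and not a reconstruction of the paper's attack, the sketch below sums the same four numbers in two mathematically equivalent orders. The function names, the input values, and the 0.5 threshold are all hypothetical choices made for this example.

```python
# Illustrative only: floating-point addition is not associative, so two
# mathematically equivalent evaluation orders can give different results.
# This is NOT the method from the paper; names and values are assumptions.

def sequential_sum(values):
    """Left-to-right accumulation, as a naive eager loop would do it."""
    total = 0.0
    for v in values:
        total += v
    return total


def pairwise_sum(values):
    """Tree-shaped reduction, one order an optimizer might pick instead."""
    if len(values) == 1:
        return values[0]
    mid = len(values) // 2
    return pairwise_sum(values[:mid]) + pairwise_sum(values[mid:])


# Hypothetical inputs chosen so that rounding makes the two orders disagree.
values = [1e16, 1.0, -1e16, 1.0]

eager = sequential_sum(values)      # ((1e16 + 1) - 1e16) + 1
reordered = pairwise_sum(values)    # (1e16 + 1) + (-1e16 + 1)

# A downstream decision that depends on the sum can flip with the order.
THRESHOLD = 0.5  # hypothetical decision boundary
eager_decision = eager > THRESHOLD
reordered_decision = reordered > THRESHOLD

print(eager, reordered, eager_decision, reordered_decision)
```

Here the two orders disagree only because of rounding, yet the thresholded decision differs between them. Whether and how real compilers introduce differences of this kind, and how the paper exploits them, is not stated in the truncated abstract.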

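Impact: The new research highlights critical security vulnerabilities in the LLM deployment pipeline that may affect the safety and reliability of AI systems.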

Ranking rationale: Multiple academic papers published on arXiv detail new research on LLM vulnerabilities and the theory of backdoor attacks.


AI-generated summary · Google Gemini · from 5 sources.

Sources [5]

arXiv cs.AI TIER_1 English(EN) · Yifei Wang, Tianlin Li, Xiaohan Zhang, Yida Yang, Xiaoyu Zhang, Li Pan · 2026-05-22 04:00

Trusted Weights, Treacherous Optimizations? Optimization-Triggered Backdoor Attacks on LLMs

arXiv:2605.20641v1 Announce Type: cross Abstract: Inference optimization is a vital technique for deploying LLMs at scale. Compilation is the most widely adopted optimization technique for LLMs. While it assumes semantic equivalence between the original and compiled graphs, we fi…

arXiv cs.LG TIER_1 English(EN) · Aman Saxena, Jan Schuchardt, Yan Scholten, Stephan Günnemann · 2026-05-22 04:00

Provable Robustness to Backdoor Attacks Through the Dual View of Differential Privacy

arXiv:2605.21780v1 Announce Type: new Abstract: Randomized smoothing is a powerful tool for certifying robustness to adversarial perturbations, including poisoning attacks via randomized training and evasion attacks via randomized inference. Extending these guarantees to backdoor…

arXiv cs.LG TIER_1 English(EN) · Donald Flynn, Hadas Yaron Goldhirsh, Jonathan P. Keating, Inbar Seroussi · 2026-05-22 04:00

When Stronger Triggers Backfire: A High-Dimensional Theory of Backdoor Attacks

arXiv:2605.22481v1 Announce Type: new Abstract: Backdoor poisoning attacks behave counter-intuitively in high dimensions: stronger training triggers can help the defender. We study regularised generalised linear models on Gaussian-mixture data in the proportional regime ($p/n \to…

arXiv cs.LG TIER_1 English(EN) · Inbar Seroussi · 2026-05-21 13:39

When Stronger Triggers Backfire: A High-Dimensional Theory of Backdoor Attacks

Backdoor poisoning attacks behave counter-intuitively in high dimensions: stronger training triggers can help the defender. We study regularised generalised linear models on Gaussian-mixture data in the proportional regime ($p/n \to κ$), varying the training trigger strength $α$ …

arXiv cs.AI TIER_1 English(EN) · Li Pan · 2026-05-20 02:55

Trusted Weights, Treacherous Optimizations? Optimization-Triggered Backdoor Attacks on LLMs

Inference optimization is a vital technique for deploying LLMs at scale. Compilation is the most widely adopted optimization technique for LLMs. While it assumes semantic equivalence between the original and compiled graphs, we first uncover its numerical side effects can be mali…
