MUZZLE: Adaptive Agentic Red-Teaming of Web Agents Against Indirect Prompt Injection Attacks

New framework MUZZLE discovers 44 novel attacks

By the PulseAugur editorial team · [1 source] · 2026-06-16 04:00

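Researchers have developed MUZZLE, an automated framework for testing the security of web agents against indirect prompt injection attacks. The system adaptively identifies vulnerable injection points and crafts context-aware malicious instructions that compromise confidentiality, integrity, and availability. In evaluations across a range of web applications and LLMs, MUZZLE uncovered a large number of novel attacks, demonstrating that it can find vulnerabilities with minimal human supervision.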

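Impact: This research highlights critical security vulnerabilities in web agents and may influence how future LLM applications are developed and secured.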

Ranking rationale: This cluster contains an academic paper detailing a new research framework and its findings. [lever_c_demoted from research: ic=1 ai=1.0]

AI-generated summary · Google Gemini · from 1 source.

Sources [1]

arXiv cs.AI TIER_1 English(EN) · Georgios Syros, Evan Rose, Brian Grinstead, Christoph Kerschbaumer, William Robertson, Cristina Nita-Rotaru, Alina Oprea · 2026-06-16 04:00

MUZZLE: Adaptive Agentic Red-Teaming of Web Agents Against Indirect Prompt Injection Attacks

arXiv:2602.09222v2 Announce Type: replace-cross Abstract: Large language model (LLM) based web agents are increasingly deployed to automate complex online tasks by directly interacting with web sites and performing actions on users' behalf. While these agents offer powerful capab…
