WMAttack: Automated Attack Search for Adversarial Evaluation of World-Model Agents

New framework automatically searches for adversarial attacks on world-model agents

By PulseAugur Editorial · [1 source] · 2026-05-25 04:00

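Researchers have developed WMAttack, an automated framework for rigorously evaluating the adversarial robustness of world-model agents. The system addresses the challenge of finding attacks efficiently without overestimating an agent's robustness. WMAttack uses techniques such as Self-Correcting Attack Search (SCAS) and Representation-Guided Attack Retrieval (RGAR) to discover stronger attacks and improve search efficiency across a range of tasks.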

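Impact: This research introduces a new approach to evaluating the adversarial robustness of AI agents, which could lead to safer and more reliable decision-making systems.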

Ranking rationale: This cluster contains an academic paper detailing a new method for evaluating AI agents.

AI-generated summary · Google Gemini · from 1 source.

Sources [1]

arXiv cs.LG · Zhixiang Guo, Siyuan Liang, Shi Fu, Cheng Guo, Andras Balogh, Mark Jelasity, Dacheng Tao · 2026-05-25 04:00

WMAttack: Automated Attack Search for Adversarial Evaluation of World-Model Agents

arXiv:2605.23220v1. Abstract: Despite the growing use of world models as decision-making agents, their adversarial robustness remains underexplored due to the lack of dedicated automated evaluation methods. A key obstacle is that attack evaluation must be both a…