Why Does Agentic Safety Fail to Generalize Across Tasks?

Study Finds AI Agent Safety Fails to Generalize Across Tasks

By the PulseAugur Editorial Team · [2 sources] · 2026-05-07 22:16

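A new research paper examines why AI agents struggle to remain safe when generalizing to new tasks. The study suggests that the difficulty stems from the inherently complex relationship between a task and its safe execution, rather than from training limitations alone. Experiments with a simulated quadcopter and with LLMs in a CRM setting indicate that current safety methods may be insufficient and that new approaches are needed.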

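Impact: Highlights a fundamental challenge in AI safety, suggesting that current methods are insufficient and that new approaches are needed to achieve reliable agent behavior.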

Ranking rationale: An academic paper published on arXiv that details theoretical and empirical findings on the generalization of AI safety.

AI-generated summary · Google Gemini · based on 2 sources.

Sources [2]

arXiv stat.ML TIER_1 English(EN) · Yonatan Slutzky, Yotam Alexander, Tomer Slor, Yoav Nagel, Nadav Cohen · 2026-05-11 04:00

Why Does Agentic Safety Fail to Generalize Across Tasks?

arXiv:2605.06992v1 Announce Type: cross Abstract: AI agents are increasingly deployed in multi-task settings, where the task to perform is specified at test time, and the agent must generalize to unseen tasks. A major concern in such settings is safety: often, an agent must not o…

arXiv stat.ML TIER_1 English(EN) · Nadav Cohen · 2026-05-07 22:16

Why Does Agentic Safety Fail to Generalize Across Tasks?

AI agents are increasingly deployed in multi-task settings, where the task to perform is specified at test time, and the agent must generalize to unseen tasks. A major concern in such settings is safety: often, an agent must not only execute unseen tasks, but do so while avoiding…