New benchmark reveals AI agent skills vulnerable to novel attacks

By PulseAugur Editorial · [2 sources] · 2026-06-01 17:45

Researchers have developed SkillHarm, a benchmark for evaluating security vulnerabilities in AI agent skills. It covers two attack scenarios: Fixed-Payload Poisoning, in which a skill directly compromises a task, and Self-Mutating Poisoning, in which a skill alters itself over time. SkillHarm contains 879 attack samples across 71 skills, and attacks against current agents succeed at rates of up to 86.3%. The study also finds that many apparent defense successes occur only because the agent never engaged with the poisoned files, which suggests that current defenses are insufficient.
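To make the point about non-engagement concrete, here is a minimal sketch of how an overall attack success rate can differ from the rate measured only on runs where the agent actually touched the poisoned file. The `Trial` record, its field names, and the toy data are assumptions made for illustration; they are not the paper's schema, scoring code, or results.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    # Hypothetical fields, not taken from the SkillHarm paper.
    engaged: bool      # the agent opened or executed the poisoned skill file
    compromised: bool  # the attack achieved its goal


def attack_success_rate(trials):
    """Fraction of all trials in which the attack succeeded."""
    if not trials:
        return 0.0
    return sum(t.compromised for t in trials) / len(trials)


def engaged_attack_success_rate(trials):
    """Attack success rate over trials where the agent engaged with the file."""
    return attack_success_rate([t for t in trials if t.engaged])


def non_engagement_share_of_defenses(trials):
    """Share of apparent defenses (no compromise) where the agent never engaged."""
    defended = [t for t in trials if not t.compromised]
    if not defended:
        return 0.0
    return sum(not t.engaged for t in defended) / len(defended)


# Toy data invented for illustration only; not results from the paper.
trials = [
    Trial(engaged=True, compromised=True),
    Trial(engaged=True, compromised=True),
    Trial(engaged=True, compromised=False),
    Trial(engaged=False, compromised=False),
    Trial(engaged=False, compromised=False),
]

print(attack_success_rate(trials))
print(engaged_attack_success_rate(trials))
print(non_engagement_share_of_defenses(trials))
```

In this toy set the overall rate is lower than the engaged-only rate, and most of the apparent defenses come from runs where the agent never read the poisoned file. That is the kind of gap the summary describes.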

IMPACT Highlights critical security flaws in AI agent skills, potentially impacting the safe deployment of agent-based systems.

RANK_REASON This is a research paper introducing a new benchmark and taxonomy for evaluating AI agent skill security.

paper
safety

AI-generated summary · Google Gemini · from 2 sources.

COVERAGE [2]

arXiv cs.CL TIER_1 English(EN) · Yuting Ning, Zhehao Zhang, Yash Kumar Lal, Boyu Gou, Junyi Li, Weitong Ruan, Chentao Ye, Rahul Gupta, Diyi Yang, Yu Su, Huan Sun · 2026-06-02 04:00

SkillHarm: Lifecycle-Aware Skill-Based Attacks via Automated Construction

arXiv:2606.02540v1 Announce Type: new Abstract: Agent skills occupy a privileged position in the agent workflow, as agents are expected to implicitly follow and execute them, rendering third-party skills a vulnerable attack surface. Existing studies have revealed unsafe agent beh…

arXiv cs.CL TIER_1 English(EN) · Huan Sun · 2026-06-01 17:45

SkillHarm: Lifecycle-Aware Skill-Based Attacks via Automated Construction

Agent skills occupy a privileged position in the agent workflow, as agents are expected to implicitly follow and execute them, rendering third-party skills a vulnerable attack surface. Existing studies have revealed unsafe agent behaviors induced by skill-based attacks, but they …
