PulseAugur

New benchmark reveals AI agent skills vulnerable to novel attacks

Researchers have introduced SkillHarm, a benchmark for evaluating security vulnerabilities in AI agent skills. It covers two attack scenarios: Fixed-Payload Poisoning, in which a skill directly compromises a task, and Self-Mutating Poisoning, in which a skill alters itself over time. SkillHarm contains 879 attack samples across 71 skills, and attacks against current agents succeed at rates of up to 86.3%. The study also finds that many apparent defense successes occur because the agent never engaged with the poisoned files, which suggests that current defenses are insufficient.
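The point about apparent defense successes is easier to see with numbers. The sketch below is not the paper's evaluation code: the `Trial` record, its `engaged` and `attacked` fields, and the toy data are all assumptions made for illustration. It compares the attack success rate over all samples with the rate among only those trials where the agent actually engaged with the poisoned file.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    # Hypothetical fields, not taken from SkillHarm itself.
    engaged: bool   # did the agent open or run the poisoned skill file?
    attacked: bool  # did the attack achieve its goal?


def success_rates(trials):
    """Return (overall attack success rate, rate among engaged trials)."""
    engaged = [t for t in trials if t.engaged]
    hits = sum(t.attacked for t in trials)
    engaged_hits = sum(t.attacked for t in engaged)
    overall = hits / len(trials) if trials else 0.0
    conditional = engaged_hits / len(engaged) if engaged else 0.0
    return overall, conditional


# Toy data invented for illustration only (not results from the paper):
# 3 engaged and compromised, 1 engaged and safe, 6 never engaged.
trials = (
    [Trial(True, True)] * 3
    + [Trial(True, False)] * 1
    + [Trial(False, False)] * 6
)

overall, conditional = success_rates(trials)
print(overall, conditional)  # 0.3 0.75
```

In this toy case the agent looks safe in 70% of trials, yet it resisted the attack in only one of the four trials where it touched the poisoned file. That gap is the kind of effect the summary describes.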

IMPACT Highlights critical security flaws in AI agent skills, potentially impacting the safe deployment of agent-based systems.

RANK_REASON This is a research paper introducing a new benchmark and taxonomy for evaluating AI agent skill security.


AI-generated summary · Google Gemini · from 2 sources.

COVERAGE [2]

  1. arXiv cs.CL TIER_1 English(EN) · Yuting Ning, Zhehao Zhang, Yash Kumar Lal, Boyu Gou, Junyi Li, Weitong Ruan, Chentao Ye, Rahul Gupta, Diyi Yang, Yu Su, Huan Sun ·

    SkillHarm: Lifecycle-Aware Skill-Based Attacks via Automated Construction

    arXiv:2606.02540v1. Abstract: Agent skills occupy a privileged position in the agent workflow, as agents are expected to implicitly follow and execute them, rendering third-party skills a vulnerable attack surface. Existing studies have revealed unsafe agent beh…

  2. arXiv cs.CL TIER_1 English(EN) · Huan Sun ·

    SkillHarm: Lifecycle-Aware Skill-Based Attacks via Automated Construction

    Agent skills occupy a privileged position in the agent workflow, as agents are expected to implicitly follow and execute them, rendering third-party skills a vulnerable attack surface. Existing studies have revealed unsafe agent behaviors induced by skill-based attacks, but they …