PulseAugur

New method detects AI agent memory poisoning with 0.99 AUC

Researchers have identified a method for detecting memory poisoning attacks on AI agents by analyzing their tool-call trajectories. They report a behavioral invariant: in architectures where routing information is retrieved through observable memory-tool invocations, successful attacks require calling `memory_recall_fact` before `email_send_email`, a sequence rarely seen in legitimate sessions. A Random Forest classifier built on this invariant achieves an AUC of 0.9904 and generalizes across models, including GPT-4.1 and GPT-4o, without retraining. The method can also distinguish memory-channel attacks from prompt-injection attacks using tool-call logs alone.
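As a rough illustration of the idea, the check for this tool-call ordering can be sketched in a few lines of Python. This is a minimal sketch and not the authors' implementation: only the tool names `memory_recall_fact` and `email_send_email` come from the summary; the function names, the `calendar_list_events` tool, and the sample sessions are hypothetical. The summary does not say whether the two calls must be adjacent, so the sketch computes both an "anywhere earlier" check and a strict adjacent-transition check.

```python
from collections import Counter
from typing import Sequence

# Tool names taken from the summary; everything else here is hypothetical.
RECALL = "memory_recall_fact"
SEND = "email_send_email"


def transition_counts(trajectory: Sequence[str]) -> Counter:
    """Count adjacent tool-call pairs (bigrams) in one session's trajectory."""
    return Counter(zip(trajectory, trajectory[1:]))


def recall_precedes_send(trajectory: Sequence[str]) -> bool:
    """True if any memory recall occurs earlier than some email send."""
    seen_recall = False
    for tool in trajectory:
        if tool == RECALL:
            seen_recall = True
        elif tool == SEND and seen_recall:
            return True
    return False


def has_direct_transition(trajectory: Sequence[str]) -> bool:
    """True if a memory recall is immediately followed by an email send."""
    return transition_counts(trajectory)[(RECALL, SEND)] > 0


# Hypothetical sessions, invented for illustration only.
benign = ["calendar_list_events", SEND]
suspicious = [RECALL, SEND]
gapped = [RECALL, "calendar_list_events", SEND]

for name, session in [("benign", benign), ("suspicious", suspicious), ("gapped", gapped)]:
    print(name, recall_precedes_send(session), has_direct_transition(session))
```

Flags or bigram counts like these could serve as input features for a classifier such as the Random Forest the summary mentions; the paper's actual feature set is not described here.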

IMPACT This research offers a log-based method for detecting memory poisoning in AI agents, potentially improving the reliability of AI systems in critical applications.

RANK_REASON The cluster contains a research paper detailing a new method for detecting AI agent memory poisoning.

Read on arXiv cs.LG →

AI-generated summary · Google Gemini · from 2 sources.


COVERAGE [2]

  1. arXiv cs.LG TIER_1 English(EN) · Jun Wen Leong

    Forensic Trajectory Signatures for Agent Memory Poisoning Detection

    arXiv:2606.30566v1 Announce Type: cross Abstract: We discover a behavioral invariant in LLM agents under persistent memory poisoning: in architectures where routing information is retrieved through observable memory-tool invocations, successful attacks require calling memory_reca…

  2. arXiv cs.LG TIER_1 English(EN) · Jun Wen Leong

    Forensic Trajectory Signatures for Agent Memory Poisoning Detection

    We discover a behavioral invariant in LLM agents under persistent memory poisoning: in architectures where routing information is retrieved through observable memory-tool invocations, successful attacks require calling memory_recall_fact before email_send_email, a transition that…