CASPIAN: Online Detection and Attribution of Cascade Attacks in LLM Multi-Agent Systems via Cross-Channel Causal Monitoring
Researchers have developed CASPIAN, a framework for detecting and attributing cascade attacks in multi-agent systems built on large language models (LLMs). In a cascade attack, adversarial influence spreads from agent to agent and can cause system-wide failure; because the effect is distributed across interconnected agents, it is hard to trace. CASPIAN performs cross-channel causal analysis: it models agent interactions with a dynamic causal influence matrix, estimated through a late-interaction conditional transfer entropy formulation. From this matrix it identifies the attack's origin, bridge, and amplifier agents, as well as its propagation pathways. It is reported to outperform existing defenses in accuracy and early detection while adding minimal latency overhead.
AI IMPACT: This research introduces a method for securing LLM-based multi-agent systems against cascade attacks, which could improve the reliability of AI agents in complex interactions.
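To make the idea of a causal influence matrix concrete, the following is a minimal sketch and not CASPIAN's actual estimator. It assumes each agent's behavior has already been reduced to a discrete signal per time step (for example, a binary anomaly flag), uses a history length of one step, and applies a plain plug-in estimate of conditional transfer entropy. The summary does not specify the "late-interaction" formulation, so it is not reproduced here. The function names and the net-outflow rule for picking an origin are hypothetical, chosen only for illustration.

```python
import math
import random
from collections import Counter


def conditional_transfer_entropy(src, dst, cond):
    """Plug-in estimate (in bits) of TE(src -> dst | cond), history length 1.

    TE = sum p(y1, y0, x0, z0) * log2( p(y1 | y0, x0, z0) / p(y1 | y0, z0) )
    where y1 = dst[t+1], y0 = dst[t], x0 = src[t], z0 = cond[t].
    """
    n = len(dst) - 1
    joint, ctx_full, pair_red, ctx_red = Counter(), Counter(), Counter(), Counter()
    for t in range(n):
        y1, y0, x0, z0 = dst[t + 1], dst[t], src[t], cond[t]
        joint[(y1, y0, x0, z0)] += 1
        ctx_full[(y0, x0, z0)] += 1
        pair_red[(y1, y0, z0)] += 1
        ctx_red[(y0, z0)] += 1
    te = 0.0
    for (y1, y0, x0, z0), c in joint.items():
        p_full = c / ctx_full[(y0, x0, z0)]               # p(y1 | y0, x0, z0)
        p_red = pair_red[(y1, y0, z0)] / ctx_red[(y0, z0)]  # p(y1 | y0, z0)
        te += (c / n) * math.log2(p_full / p_red)
    return te


def influence_matrix(traces):
    """M[i][j] = TE(agent i -> agent j | all remaining agents)."""
    k = len(traces)
    m = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            if i == j:
                continue
            others = [traces[r] for r in range(k) if r not in (i, j)]
            # Condition on the joint state of every other agent at time t.
            cond = list(zip(*others)) if others else [()] * len(traces[i])
            m[i][j] = conditional_transfer_entropy(traces[i], traces[j], cond)
    return m


# Toy cascade (synthetic data): agent 0 emits random bits, agent 1 repeats
# agent 0 one step later, and agent 2 repeats agent 1 one step later.
random.seed(0)
a0 = [random.randint(0, 1) for _ in range(2000)]
a1 = [0] + a0[:-1]
a2 = [0] + a1[:-1]
M = influence_matrix([a0, a1, a2])

# Hypothetical attribution heuristic: the origin is the agent with the
# largest outgoing-minus-incoming influence.
net = [sum(M[i]) - sum(row[i] for row in M) for i in range(3)]
origin = max(range(3), key=lambda i: net[i])

for row in M:
    print(["%.3f" % v for v in row])
print("estimated origin agent:", origin)
```

In this toy chain the direct links 0 -> 1 and 1 -> 2 carry roughly one bit each, while the indirect link 0 -> 2 drops to zero once agent 1 is conditioned on. That is the property that lets a conditional measure separate a bridge agent from the origin. A real deployment would need richer per-channel signals, longer histories, and a bias-corrected estimator.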