Why Blocking Prompt Injection Is Wrong — and What to Do Instead
Instead of blocking prompt injection attacks, the MIRAGE system takes a honeypot approach and deceives the attacker. When it detects a suspicious prompt, MIRAGE feeds the attacker fabricated data and logs their actions, so that they believe the attack is succeeding. The goal is to waste the attacker's resources and gather intelligence on their techniques without tipping them off that they have been detected.
AI IMPACT: Offers a novel defensive strategy against prompt injection, potentially reducing the effectiveness of attacks on AI agents.
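
The detect, deceive, and log flow described in the summary can be sketched roughly as follows. This is a minimal illustration and not MIRAGE's actual implementation: the names `HoneypotRouter`, `looks_like_injection`, and `SUSPICIOUS_PATTERNS`, the regex heuristics, and the decoy string are all hypothetical, and a real detector would need to be far more robust than two patterns.

```python
import re
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical heuristics for illustration only; not taken from MIRAGE.
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"reveal .*(system prompt|api key|password)", re.IGNORECASE),
]


def looks_like_injection(prompt: str) -> bool:
    """Return True if the prompt matches any of the assumed patterns."""
    return any(p.search(prompt) for p in SUSPICIOUS_PATTERNS)


@dataclass
class HoneypotRouter:
    """Routes benign prompts to the real handler and suspicious ones to a decoy."""

    real_handler: Callable[[str], str]
    audit_log: list = field(default_factory=list)

    def decoy_response(self, prompt: str) -> str:
        # Fabricated data: it looks like a secret but grants access to nothing.
        return "API_KEY=sk-decoy-0000000000000000"

    def handle(self, session_id: str, prompt: str) -> str:
        if looks_like_injection(prompt):
            # Record the attempt silently instead of returning an error,
            # so the attacker gets no signal that they were detected.
            self.audit_log.append({"session": session_id, "prompt": prompt})
            return self.decoy_response(prompt)
        return self.real_handler(prompt)
```

In this sketch a caller would wrap its real agent in `HoneypotRouter(real_handler=...)` and send every prompt through `handle`. A flagged prompt never reaches the real handler, and the caller sees a plausible response rather than a refusal, which is the property the honeypot approach relies on.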