New PARASITE technique hijacks LLMs via conditional system prompt poisoning

By PulseAugur Editorial · [1 sources] · 2026-04-28 04:00

Researchers have developed a new framework called PARASITE that can conditionally poison system prompts for large language models. This method allows adversaries to create prompts that appear benign but trigger compromised responses for specific queries, such as political questions, while maintaining normal functionality for other inputs. PARASITE operates in a black-box setting and has demonstrated effectiveness against models like GPT-4o-mini and GPT-3.5, evading common defenses. AI

IMPACT Introduces a novel attack vector for LLMs, highlighting potential supply-chain vulnerabilities in prompt marketplaces.

RANK_REASON The cluster contains an academic paper detailing a new method for attacking LLMs.

Read on arXiv cs.CL →

paper
safety

AI-generated summary · Google Gemini · from 1 sources. How we write summaries →

COVERAGE [1]

arXiv cs.CL TIER_1 English(EN) · Viet Pham, Thai Le · 2026-04-28 04:00

PARASITE: Conditional System Prompt Poisoning to Hijack LLMs

arXiv:2505.16888v4 Announce Type: replace-cross Abstract: Large Language Models (LLMs) are increasingly deployed via third-party system prompts downloaded from public marketplaces. We identify a critical supply-chain vulnerability: conditional system prompt poisoning, where an ad…

COVERAGE [1]

PARASITE: Conditional System Prompt Poisoning to Hijack LLMs

RELATED ENTITIES

RELATED TOPICS