Brief · PulseAugur

TOOL · arXiv cs.LG English(EN) · 3w

Turn-Based Structural Triggers: Prompt-Free Backdoors in Multi-Turn LLMs

Researchers have developed a novel backdoor attack called Turn-based Structural Triggers (TST) that exploits the dialogue structure of Large Language Models (LLMs) rather than user-visible prompts. This attack uses the turn index within a conversation as the trigger, allowing a backdoored model to execute malicious behaviors at specific points in a dialogue without any discernible input trigger. TST demonstrated a high attack success rate across multiple LLM families while maintaining normal performance on non-triggered tasks, highlighting a new vulnerability in multi-turn conversational AI systems. AI

IMPACT Reveals a new attack vector for LLMs, necessitating the development of structure-aware auditing methods beyond prompt inspection.

Large Language Models
Yiyang Lu
Turn-based Structural Triggers