PulseAugur
EN
LIVE 15:45:27

New backdoor attack exploits LLM dialogue structure

Researchers have developed a novel backdoor attack called Turn-based Structural Triggers (TST) that exploits the dialogue structure of Large Language Models (LLMs) rather than user-visible prompts. This attack uses the turn index within a conversation as the trigger, allowing a backdoored model to execute malicious behaviors at specific points in a dialogue without any discernible input trigger. TST demonstrated a high attack success rate across multiple LLM families while maintaining normal performance on non-triggered tasks, highlighting a new vulnerability in multi-turn conversational AI systems. AI

IMPACT Reveals a new attack vector for LLMs, necessitating the development of structure-aware auditing methods beyond prompt inspection.

RANK_REASON The cluster contains an academic paper detailing a new method for attacking LLMs. [lever_c_demoted from research: ic=1 ai=1.0]

Read on arXiv cs.LG →

AI-generated summary · Google Gemini · from 1 sources. How we write summaries →

COVERAGE [1]

  1. arXiv cs.LG TIER_1 English(EN) · Yiyang Lu, Jinwen He, Yue Zhao, Kai Chen, Ruigang Liang, Cheng Hong, Yingjun Zhang ·

    Turn-Based Structural Triggers: Prompt-Free Backdoors in Multi-Turn LLMs

    arXiv:2601.14340v2 Announce Type: replace-cross Abstract: Large Language Models (LLMs) are widely integrated into interactive systems such as dialogue agents and task-oriented assistants. This growing ecosystem also raises supply-chain risks, where adversaries can distribute pois…