New backdoor attack uses emotion as dynamic trigger for LLMs

By PulseAugur Editorial · Summary by gemini-2.5-flash-lite from 1 source

Researchers have developed a novel backdoor attack method called Paraesthesia for large language models, which leverages emotional style as a dynamic trigger. Unlike previous attacks that used static triggers, this method injects emotional cues into the fine-tuning data, causing the model to generate malicious outputs when encountering emotional inputs during inference. The attack reportedly achieves a near 99% success rate across various tasks and models while preserving the model's original utility. AI

Summary written by gemini-2.5-flash-lite from 1 source. How we write summaries →

IMPACT This research highlights a new vulnerability in LLMs, potentially impacting the security and trustworthiness of AI systems that rely on emotional context.

RANK_REASON Academic paper detailing a new method for attacking LLMs. [lever_c_demoted from research: ic=1 ai=1.0]

Read on arXiv cs.CL →

safety
paper

COVERAGE [1]

arXiv cs.CL TIER_1 · Junjiang He · 2026-05-12 06:42

When Emotion Becomes Trigger: Emotion-style dynamic Backdoor Attack Parasitising Large Language Models

Backdoor vulnerabilities widely exist in the fine-tuning of large language models(LLMs). Most backdoor poisoning methods operate mainly at the token level and lack deeper semantic manipulation, which limits stealthiness. In addition, Prior attacks rely on a single fixed trigger t…

COVERAGE [1]

When Emotion Becomes Trigger: Emotion-style dynamic Backdoor Attack Parasitising Large Language Models

RELATED ENTITIES

RELATED TOPICS