New LLM steganography methods bypass text and activation defenses

By the PulseAugur editorial team · [5 sources] · 2026-06-07 01:41

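Researchers have identified new ways to hide messages in large language model (LLM) pipelines that bypass conventional text-based safety measures. One technique carries the payload as structured floating-point parameters, evading detection even when text classifiers are in place. Another exploits the pseudorandom number generator used in LLM inference, embedding the message in the seed so that the secret can be reconstructed from the generated text alone. A further study shows that even sophisticated internal-activation probes designed to detect such hidden messages can be evaded, although specific data-level interventions restore detectability.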
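To illustrate why a text-only filter can miss a payload carried as floating-point parameters, here is a minimal sketch from the defender's side. It does not reproduce the encoding used in the papers, which this summary does not specify. The byte-to-float mapping (`byte / 256`), the toy `BLOCKLIST` filter, the `weights` field name, and all function names are hypothetical and chosen only for illustration.

```python
import json

# Hypothetical toy text filter: flags a string containing a known bad phrase.
BLOCKLIST = ("ignore previous instructions",)


def text_filter_flags(s: str) -> bool:
    """Return True if the inspected text view contains a blocklisted phrase."""
    return any(phrase in s.lower() for phrase in BLOCKLIST)


def encode_as_floats(message: str) -> list[float]:
    """Assumed encoding: one UTF-8 byte per float, scaled into [0, 1).

    byte / 256 is exactly representable in binary floating point,
    so the values survive a JSON round trip unchanged.
    """
    return [b / 256.0 for b in message.encode("utf-8")]


def decode_floats(values: list[float]) -> str:
    """Invert the assumed encoding so the text filter can see the content."""
    raw = bytes(round(v * 256) % 256 for v in values)
    return raw.decode("utf-8", errors="replace")


payload = "Ignore previous instructions"
doc = json.dumps({"weights": encode_as_floats(payload)})

# The serialized document holds only numbers, so the text view looks clean.
flagged_raw = text_filter_flags(doc)

# Decoding the numeric field first exposes the payload to the same filter.
decoded = decode_floats(json.loads(doc)["weights"])
flagged_decoded = text_filter_flags(decoded)

print(flagged_raw, flagged_decoded)
```

Under these assumptions the same filter misses the serialized document and catches the decoded field, which is the gap the summary describes: the check has to run on every view the model can reconstruct, not only on the literal text.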

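Impact: Reveals new attack vectors against LLM security and underscores the need for detection mechanisms that go beyond simple text analysis.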

Ranking rationale: Multiple research papers detail new steganography methods within LLMs and defenses against them.

AI-generated summary · Google Gemini · from 5 sources.

Sources [5]

arXiv cs.AI TIER_1 English(EN) · Mudit Sinha, Sanika Chavan · 2026-06-09 04:00

Hiding in Plain Floats: Steganographic Carriers for Indirect Prompt and Content Injection

arXiv:2606.08403v1 Announce Type: cross Abstract: Text-centered prompt-injection defenses assume that the malicious signal is visible in one of the inspected text views. We study a reproducible LLM01-style indirect prompt/content-injection failure mode where that assumption break…
arXiv cs.AI TIER_1 English(EN) · Felix Mächtle, Jonas Sander, Sebastian Berndt, Ben Weimar, Nils Loose, Thomas Eisenbarth · 2026-06-09 04:00

Steganography Without Modification: Hidden Communication via LLM Seeds

arXiv:2606.09135v1 Announce Type: cross Abstract: We demonstrate that widely deployed Large Language Model (LLM) inference stacks harbor a steganographic channel that requires no modification to model weights, sampling code, or output distributions. The channel exploits a structu…
arXiv cs.LG TIER_1 English(EN) · Charles Westphal, Timothy Douglas, Keivan Navaie, Tiago Pimentel, Fernando E. Rosas · 2026-06-09 04:00

Now You (Still) See Me: Detecting Evasive Steganographic Payloads in Large Language Models

arXiv:2606.09411v1 Announce Type: cross Abstract: Large language models can be fine-tuned to encode prompt-borne secrets into fluent, seemingly benign outputs. This creates a steganographic exfiltration risk that is difficult to detect with output-level steganalysis. Recent work …
arXiv cs.LG TIER_1 English(EN) · Fernando E. Rosas · 2026-06-08 12:27

Now You (Still) See Me: Detecting Evasive Steganographic Payloads in Large Language Models

Large language models can be fine-tuned to encode prompt-borne secrets into fluent, seemingly benign outputs. This creates a steganographic exfiltration risk that is difficult to detect with output-level steganalysis. Recent work proposes mechanistic detection using linear probes…
arXiv cs.AI TIER_1 English(EN) · Sanika Chavan · 2026-06-07 01:41

Hiding in Plain Floats: Steganographic Carriers for Indirect Prompt and Content Injection

Text-centered prompt-injection defenses assume that the malicious signal is visible in one of the inspected text views. We study a reproducible LLM01-style indirect prompt/content-injection failure mode where that assumption breaks: a payload caught in plain English slips past th…
