PIIGuard 通过对抗性碎片保护网页免受 LLM PII 抓取

作者 PulseAugur 编辑部 · [2 个来源] · 2026-05-04 20:13

研究人员开发了 PIIGuard，这是一种新颖的网页级防御系统，旨在阻止大型语言模型 (LLM) 抓取个人身份信息 (PII)。该系统在网页中嵌入隐藏的 HTML 碎片，巧妙地引导 LLM 远离泄露敏感数据。PIIGuard 在包括 GPT-5.4-nano、Claude-haiku-4.5 和 DeepSeek-chat 在内的多个 LLM 模型上展示了至少 97.0% 的防御成功率，同时保持了页面在标准问答任务中的可用性。 AI

影响为网站所有者提供了一种新方法，以保护用户数据免受基于 LLM 的抓取。

排序理由学术论文，详细介绍了一种减轻 LLM PII 泄露的新方法。

在 arXiv cs.CL 阅读 →

AI 生成摘要 · Google Gemini · 来自 2 个来源。我们如何撰写摘要 →

报道来源 [2]

arXiv cs.CL TIER_1 English(EN) · Mingshuo Liu, Yiwei Zha, Min Chen · 2026-05-06 04:00

PIIGuard：在对抗性清理下缓解 PII 泄露

arXiv:2605.03129v1 Announce Type: cross Abstract: Browsing-enabled LLM assistants can fetch webpages and answer contact-seeking queries, creating a practical channel for scraping contact-style personally identifiable information (PII) from public pages. Many prior defenses are de…
arXiv cs.CL TIER_1 English(EN) · Min Chen · 2026-05-04 20:13

PIIGuard：在对抗性清理下缓解 PII 泄露

Browsing-enabled LLM assistants can fetch webpages and answer contact-seeking queries, creating a practical channel for scraping contact-style personally identifiable information (PII) from public pages. Many prior defenses are deployed at the model, service, or agent layer rathe…

报道来源 [2]

PIIGuard：在对抗性清理下缓解 PII 泄露

PIIGuard：在对抗性清理下缓解 PII 泄露

相关实体

相关话题