PIIGuard shields webpages from LLM PII harvesting via adversarial fragments

By PulseAugur Editorial · [2 sources] · 2026-05-04 20:13

Researchers have developed PIIGuard, a novel webpage-level defense system designed to prevent large language models (LLMs) from harvesting personally identifiable information (PII). This system embeds hidden HTML fragments within webpages that subtly redirect LLMs away from disclosing sensitive data. PIIGuard demonstrated a defense success rate of at least 97.0% across several LLM models, including GPT-5.4-nano, Claude-haiku-4.5, and DeepSeek-chat, while maintaining the page's utility for standard question-answering tasks. AI

IMPACT Offers a new method for website owners to protect user data from LLM-based scraping.

RANK_REASON Academic paper detailing a new method for mitigating PII leakage from LLMs.

Read on arXiv cs.CL →

paper
safety

AI-generated summary · Google Gemini · from 2 sources. How we write summaries →

COVERAGE [2]

arXiv cs.CL TIER_1 English(EN) · Mingshuo Liu, Yiwei Zha, Min Chen · 2026-05-06 04:00

PIIGuard: Mitigating PII Harvesting under Adversarial Sanitization

arXiv:2605.03129v1 Announce Type: cross Abstract: Browsing-enabled LLM assistants can fetch webpages and answer contact-seeking queries, creating a practical channel for scraping contact-style personally identifiable information (PII) from public pages. Many prior defenses are de…
arXiv cs.CL TIER_1 English(EN) · Min Chen · 2026-05-04 20:13

PIIGuard: Mitigating PII Harvesting under Adversarial Sanitization

Browsing-enabled LLM assistants can fetch webpages and answer contact-seeking queries, creating a practical channel for scraping contact-style personally identifiable information (PII) from public pages. Many prior defenses are deployed at the model, service, or agent layer rathe…

COVERAGE [2]

PIIGuard: Mitigating PII Harvesting under Adversarial Sanitization

PIIGuard: Mitigating PII Harvesting under Adversarial Sanitization

RELATED ENTITIES

RELATED TOPICS