Addressing Labelled Data Scarcity: Taxonomy-Agnostic Annotation of PII Values in HTTP Traffic using LLMs

Large Language Models Enable Taxonomy-Agnostic PII Annotation in HTTP Traffic

By the PulseAugur editorial team · [1 source] · 2026-05-08 04:00

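Researchers have developed a novel pipeline that uses large language models (LLMs) to automatically identify and annotate personally identifiable information (PII) in HTTP traffic. The approach aims to overcome the limitations of existing systems, which depend on scarce manually labelled data and fixed PII taxonomies. The LLM-based method supports taxonomy-agnostic annotation, allowing flexibility across different PII definitions and domains, and it can also generate synthetic data for evaluation.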
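To make "taxonomy-agnostic" concrete, here is a minimal Python sketch of the general idea, not the paper's pipeline. It assumes the taxonomy is passed in as plain data and that a model replies with a JSON list of value/label pairs. The function names, the prompt wording, the response format, and the sample request are all hypothetical, and the model call itself is left out.

```python
import json


def build_prompt(http_request: str, taxonomy: dict[str, str]) -> str:
    """Build an annotation prompt; the taxonomy is supplied as data, not hard-coded."""
    labels = "\n".join(f"- {name}: {desc}" for name, desc in taxonomy.items())
    return (
        "Label every PII value in the HTTP request below.\n"
        f"Allowed labels:\n{labels}\n"
        'Answer with a JSON list of {"value": ..., "label": ...} objects.\n\n'
        f"{http_request}"
    )


def locate_annotations(
    http_request: str, llm_json: str, taxonomy: dict[str, str]
) -> list[tuple[int, int, str]]:
    """Map a (hypothetical) model reply onto character spans of the raw request.

    Items are dropped when the label is outside the supplied taxonomy or the
    value does not literally occur in the request.
    """
    annotations = []
    for item in json.loads(llm_json):
        value, label = item.get("value"), item.get("label")
        if label not in taxonomy or not isinstance(value, str) or not value:
            continue
        start = http_request.find(value)
        while start != -1:  # record every occurrence of the value
            annotations.append((start, start + len(value), label))
            start = http_request.find(value, start + 1)
    return sorted(annotations)


# Hypothetical request, taxonomy, and model reply, for illustration only.
request = (
    "POST /signup HTTP/1.1\nHost: example.com\n\n"
    "email=alice@example.com&city=Berlin"
)
taxonomy = {"EMAIL": "an email address", "LOCATION": "a city or postal address"}
reply = json.dumps([
    {"value": "alice@example.com", "label": "EMAIL"},
    {"value": "Berlin", "label": "LOCATION"},
    {"value": "bob", "label": "NAME"},        # label not in the taxonomy
    {"value": "555-1234", "label": "EMAIL"},  # value not in the request
])
prompt = build_prompt(request, taxonomy)
spans = locate_annotations(request, reply, taxonomy)
```

Swapping in a different `taxonomy` dictionary changes which labels are accepted without touching the code, which is the flexibility the summary describes.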

Impact: Offers a flexible, data-efficient approach to privacy auditing of web applications, with the potential to reduce manual labelling effort.

Ranking rationale: Academic paper proposing a new method that uses large language models to detect PII in HTTP traffic.

AI-generated summary · Google Gemini · based on 1 source.

Sources [1]

arXiv cs.AI · Thomas Cory, Axel Küpper · 2026-05-08 04:00

Addressing Labelled Data Scarcity: Taxonomy-Agnostic Annotation of PII Values in HTTP Traffic using LLMs

arXiv:2605.06305v1 · Abstract: Automated privacy audits of web and mobile applications often analyse outbound HTTP traffic to detect Personally Identifiable Information (PII) leakage. However, existing learning-based detectors typically depend on scarce, manually…
