On-Device PII Substitution Pipeline Uses Locale Prompts to Fix Regurgitation

By PulseAugur Editorial · [2 sources] · 2026-05-13 13:47

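Researchers have developed an on-device pipeline that replaces personally identifiable information (PII) with consistent, type-preserving fake values, with the goal of keeping the text useful for downstream tasks. The system uses a small language model (SLM) to generate the surrogate values, but it initially suffered from demonstration regurgitation, in which the model echoes the few-shot examples from its prompt. The authors introduce a locale-conditioned, rotating few-shot prompting technique to address this, which enabled PII substitution across multiple locales. However, the study found that although SLM-generated surrogates produce more natural text, they reduce training-data diversity and hurt downstream named entity recognition (NER) performance compared with simpler methods.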
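To make the two ideas in the summary concrete, here is a minimal Python sketch of consistent, type-preserving substitution combined with a locale-conditioned, rotating few-shot prompt. It is an illustration under stated assumptions, not the paper's implementation: the summary does not say how the rotation works, so the round-robin offset is one plausible reading. The names `DEMO_POOLS`, `rotate_demos`, `build_prompt`, and `Substituter` are hypothetical, the demonstration values are made up, and `generate` is a stand-in for whatever on-device SLM call is used.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical demonstration pools per (locale, entity type); values are made up.
DEMO_POOLS: Dict[Tuple[str, str], List[str]] = {
    ("en_US", "PERSON"): ["Alice Carter", "Brian Holt", "Dana Reyes"],
    ("de_DE", "PERSON"): ["Lena Vogel", "Jonas Brandt", "Mia Keller"],
}


def rotate_demos(locale: str, entity_type: str, turn: int, k: int = 2) -> List[str]:
    """Return k demonstrations for this locale and type, starting at a rotating offset."""
    pool = DEMO_POOLS[(locale, entity_type)]
    k = min(k, len(pool))
    return [pool[(turn + i) % len(pool)] for i in range(k)]


def build_prompt(locale: str, entity_type: str, demos: List[str]) -> str:
    """Assemble a few-shot prompt that states the locale explicitly."""
    lines = [f"Locale: {locale}", f"Write one new fake {entity_type} value."]
    lines += [f"Example: {d}" for d in demos]
    lines.append("Value:")
    return "\n".join(lines)


class Substituter:
    """Maps each detected entity to one stable surrogate of the same type."""

    def __init__(self, locale: str, generate: Callable[[str], str]) -> None:
        self.locale = locale
        self.generate = generate  # assumption: stand-in for the on-device SLM
        self.mapping: Dict[Tuple[str, str], str] = {}
        self.turn = 0

    def substitute(self, entity_type: str, original: str) -> str:
        key = (entity_type, original)
        if key in self.mapping:  # consistency: reuse the earlier surrogate
            return self.mapping[key]
        demos = rotate_demos(self.locale, entity_type, self.turn)
        self.turn += 1  # the next new entity sees a shifted set of examples
        candidate = self.generate(build_prompt(self.locale, entity_type, demos)).strip()
        if candidate in demos:
            # The model copied a demonstration; flag it instead of accepting it.
            raise ValueError(f"regurgitated demonstration: {candidate!r}")
        self.mapping[key] = candidate
        return candidate
```

In this sketch the same original entity always receives the same surrogate within one `Substituter`, while each new entity is prompted with a different slice of the locale's demonstration pool. Raising an error on a copied demonstration is only an illustrative choice; a real pipeline might retry or fall back to a simpler generator instead.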

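Impact: This research offers a way to improve on-device PII handling while preserving text utility, but it also highlights a trade-off in downstream NER performance.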

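Ranking rationale: This cluster describes a research paper detailing a novel approach to PII substitution that uses a small language model and a specific prompting technique.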

AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

arXiv cs.AI TIER_1 English(EN) · Deepak Kumar · 2026-05-13 13:47

Locale-Conditioned Few-Shot Prompting Mitigates Demonstration Regurgitation in On-Device PII Substitution with Small Language Models

Personally Identifiable Information (PII) redaction usually replaces detected entities with placeholder tokens such as [PERSON], destroying the downstream utility of the redacted text for retrieval and Named Entity Recognition (NER) training. We propose a fully on-device pipeline…

Hugging Face Daily Papers TIER_1 English(EN) · 2026-05-13 13:47

Locale-Conditioned Few-Shot Prompting Mitigates Demonstration Regurgitation in On-Device PII Substitution with Small Language Models

Personally Identifiable Information (PII) redaction usually replaces detected entities with placeholder tokens such as [PERSON], destroying the downstream utility of the redacted text for retrieval and Named Entity Recognition (NER) training. We propose a fully on-device pipeline…
