RedactionBench
Researchers have introduced RedactionBench, a new benchmark designed to evaluate how well large language models can redact personally identifiable information (PII) while considering contextual privacy. The benchmark includes 200 diverse documents and a novel R-Score metric that accounts for semantic similarity in redactions. Evaluations show that current models, including frontier models with agentic tools, struggle with contextual redaction, and human annotators also exhibit significant disagreement on what constitutes a contextual redaction.
AI IMPACT: Highlights a critical gap in LLM capabilities for sensitive data handling, potentially influencing future model development and evaluation standards for privacy-preserving AI.