Researchers have introduced Uncertainty-Based Debiasing and Unlearning (UBD), a framework for evaluating and mitigating data contamination in large language models (LLMs). Unlike previous methods that rely solely on aggregate accuracy, UBD evaluates models at the sample level using distributional distance metrics. It uses deep ensembles of the contaminated model to estimate per-sample memorization, and it uses the ensemble's uncertainty to construct a debiased target distribution. Experiments on the MMLU-Pro and MATH-MCQA benchmarks show that UBD reduces the inflated performance metrics caused by contamination while preserving model performance on uncontaminated data.
AI IMPACT Provides a more robust method for evaluating LLM performance by addressing data contamination, which should make benchmark results more reliable.
RANK_REASON Academic paper introducing a new methodology for LLM evaluation.
AI-generated summary · Google Gemini · from 1 source.