Uncertainty-based Debiasing and Unlearning for Decontamination

New framework uses uncertainty to tackle data contamination in large language models

By the PulseAugur editorial team · [1 source] · 2026-06-22 13:26

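Researchers have introduced Uncertainty-based Debiasing and Unlearning (UBD), a new framework for evaluating and mitigating data contamination in large language models (LLMs). Unlike earlier methods that rely solely on aggregate accuracy, UBD evaluates at the sample level using distributional distance metrics. The method uses a deep ensemble of contaminated models to estimate memorization for each sample, and uses the ensemble's uncertainty to construct a debiased target distribution. Experiments on the MMLU-Pro and MATH-MCQA benchmarks show that UBD effectively reduces the contamination-induced inflation of performance metrics while preserving model performance on uncontaminated data.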
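To make the sample-level idea concrete, here is a minimal sketch of how ensemble uncertainty and a distributional distance could be computed for one multiple-choice question. The summary does not say which uncertainty measure or distance UBD uses, so the choices here (mutual information for ensemble disagreement, Jensen-Shannon divergence for distance) and all function names are illustrative assumptions, not the paper's method.

```python
import math

# Hypothetical helpers; the measures below are assumptions, not confirmed UBD components.

def mean_distribution(ensemble_probs):
    """Average the per-option probabilities across ensemble members."""
    n = len(ensemble_probs)
    k = len(ensemble_probs[0])
    return [sum(p[i] for p in ensemble_probs) / n for i in range(k)]

def entropy(p):
    """Shannon entropy in nats; zero-probability options contribute nothing."""
    return -sum(x * math.log(x) for x in p if x > 0)

def mutual_information(ensemble_probs):
    """Ensemble disagreement: entropy of the mean minus the mean member entropy."""
    mean_p = mean_distribution(ensemble_probs)
    avg_h = sum(entropy(p) for p in ensemble_probs) / len(ensemble_probs)
    return entropy(mean_p) - avg_h

def js_divergence(p, q):
    """Jensen-Shannon divergence between two answer distributions (in nats)."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        return sum(x * math.log(x / y) for x, y in zip(a, b) if x > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# One 4-option question scored by three ensemble members (made-up numbers).
members = [
    [0.97, 0.01, 0.01, 0.01],
    [0.96, 0.02, 0.01, 0.01],
    [0.98, 0.01, 0.005, 0.005],
]
reference = [0.40, 0.30, 0.20, 0.10]  # hypothetical uncontaminated distribution

disagreement = mutual_information(members)
distance = js_divergence(mean_distribution(members), reference)
```

In this toy case the members agree almost perfectly, so `disagreement` is near zero, while `distance` from the flatter reference distribution is clearly positive. Whether that pattern should be read as memorization is exactly the kind of judgment the framework formalizes; the sketch only shows the per-sample quantities involved.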

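Impact: By addressing data contamination, the framework offers a more robust way to evaluate LLM performance, which should make benchmark results more reliable.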

Ranking rationale: Academic paper introducing a new method for LLM evaluation.


AI-generated summary · Google Gemini · based on 1 source.

Sources [1]

arXiv cs.CL · Mark Gales · 2026-06-22 13:26

Uncertainty-based Debiasing and Unlearning for Decontamination

Benchmark-based evaluation is the dominant paradigm for assessing large language model (LLM) capabilities, yet data contamination inflates reported performance and undermines fair comparison. Existing decontamination methods are evaluated solely through aggregate accuracy, which …

