Metric-Dependent Annotation Saturation for Learning from Label Distributions

Study finds that annotation requirements for AI models vary by evaluation metric

By the PulseAugur editorial team · [2 sources] · 2026-05-28 11:46

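A new research paper examines how the number of annotators needed to train AI models effectively depends on the specific evaluation metric used. Focusing on natural language inference (NLI) models, the study finds that metrics such as entropy correlation need a larger annotator pool (20-50 annotators) to stabilize, whereas distribution-matching metrics such as KL divergence converge with only 10 annotators. This suggests that annotation budgets should be tailored to the intended evaluation metric rather than set by a one-size-fits-all rule.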
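To make the two metric families concrete, the following is a minimal sketch, not the paper's code. The paper fine-tunes NLI models on subsampled label distributions; this sketch skips training entirely and only compares a k-annotator subsample against the full vote counts for each item. It uses synthetic vote counts, and every function name here (`subsample_distribution`, `saturation_curve`, and so on) is hypothetical.

```python
import math
import random


def subsample_distribution(counts, k, rng):
    """Draw k votes without replacement and return the label distribution."""
    votes = [label for label, c in enumerate(counts) for _ in range(c)]
    picked = rng.sample(votes, k)
    return [picked.count(label) / k for label in range(len(counts))]


def kl_divergence(p, q, eps=1e-9):
    """KL(p || q), clipping zeros so labels missing from q stay finite."""
    p = [max(x, eps) for x in p]
    q = [max(x, eps) for x in q]
    zp, zq = sum(p), sum(q)
    return sum((a / zp) * math.log((a / zp) / (b / zq)) for a, b in zip(p, q))


def entropy(p):
    """Shannon entropy in nats; zero-probability labels contribute nothing."""
    return -sum(x * math.log(x) for x in p if x > 0)


def pearson(xs, ys):
    """Pearson correlation; returns NaN if either input has zero variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx > 0 and sy > 0 else float("nan")


def saturation_curve(all_counts, ks, seed=0):
    """Map each k to (mean KL to the full distribution, entropy correlation)."""
    rng = random.Random(seed)
    full = [[c / sum(counts) for c in counts] for counts in all_counts]
    full_entropy = [entropy(f) for f in full]
    curve = {}
    for k in ks:
        subs = [subsample_distribution(c, k, rng) for c in all_counts]
        mean_kl = sum(kl_divergence(f, s) for f, s in zip(full, subs)) / len(full)
        curve[k] = (mean_kl, pearson(full_entropy, [entropy(s) for s in subs]))
    return curve


if __name__ == "__main__":
    # Synthetic data (an assumption): 15 items, 3 labels, 100 votes per item.
    demo = [[a, b, 100 - a - b] for a in range(10, 60, 10) for b in range(10, 40, 10)]
    for k, (kl, corr) in saturation_curve(demo, ks=[5, 10, 20, 50, 100]).items():
        print(f"k={k:3d}  mean KL={kl:.4f}  entropy corr={corr:.4f}")
```

Plotting both columns against k is one way to see where each metric flattens out on a given dataset. The numbers from this toy setup say nothing about the paper's reported thresholds, which come from fine-tuned models.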

Impact: Suggests optimizing annotation budgets according to the evaluation metric for more efficient AI model training.

Ranking rationale: This cluster contains a research paper detailing new findings on AI model training methods.


AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

arXiv cs.CL TIER_1 English(EN) · Guneet Kohli · 2026-05-29 04:00

Metric-Dependent Annotation Saturation for Learning from Label Distributions

arXiv:2605.29797v1 Announce Type: new Abstract: When annotators disagree on a label, the disagreement itself carries signal -- and the number of annotators needed to capture it depends on the evaluation metric. We fine-tune NLI models on label distributions subsampled from ChaosN…
arXiv cs.CL TIER_1 English(EN) · Guneet Kohli · 2026-05-28 11:46

Metric-Dependent Annotation Saturation for Learning from Label Distributions

When annotators disagree on a label, the disagreement itself carries signal -- and the number of annotators needed to capture it depends on the evaluation metric. We fine-tune NLI models on label distributions subsampled from ChaosNLI, a dataset providing 100 independent annotato…
