Apple ML Research: Annotation needs vary by evaluation metric

By PulseAugur Editorial · [1 source] · 2026-06-23 00:00

Apple Machine Learning Research has published a paper, "Metric-Dependent Annotation Saturation for Learning from Label Distributions," which finds that the number of annotators needed to capture the signal in label distributions depends on the evaluation metric being used. For instance, when fine-tuning NLI models, entropy correlation requires significantly more annotators to converge than distributional match does. The research also highlights that soft labels, which keep the full distribution of annotator votes rather than a single majority class, offer better regularization and generalization than one-hot labels, especially when annotations are noisy.
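To make the idea of metric-dependent saturation concrete, here is a minimal sketch of the label side of the experiment only. It is not the paper's pipeline: no model is fine-tuned, the votes are synthetic rather than drawn from ChaosNLI, and the choice of Jensen-Shannon divergence for "distributional match" and Pearson correlation for "entropy correlation" is an assumption made for illustration. All function names are hypothetical. The sketch subsamples k annotators per item from a pool of 100 and reports how closely the k-annotator soft label tracks the full-pool soft label under each metric.

```python
import math
import random

LABELS = ("entailment", "neutral", "contradiction")

def soft_label(votes):
    """Convert a list of annotator votes into a distribution over LABELS."""
    return [votes.count(label) / len(votes) for label in LABELS]

def entropy(dist):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in dist if p > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence, used here as a stand-in for distributional match."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        return sum(a * math.log(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either side has no variance."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def make_synthetic_pool(rng, n_items=200, n_annotators=100):
    """Synthetic stand-in for a 100-annotator dataset (assumption, not real data)."""
    pool = []
    for _ in range(n_items):
        weights = [rng.random() ** 2 + 0.01 for _ in LABELS]
        pool.append(rng.choices(LABELS, weights=weights, k=n_annotators))
    return pool

def saturation_curve(pool, ks, rng):
    """For each k, compare k-annotator soft labels with the full-pool soft labels."""
    full = [soft_label(votes) for votes in pool]
    full_entropy = [entropy(d) for d in full]
    rows = []
    for k in ks:
        sub = [soft_label(rng.sample(votes, k)) for votes in pool]
        mean_jsd = sum(js_divergence(s, f) for s, f in zip(sub, full)) / len(pool)
        ent_corr = pearson([entropy(s) for s in sub], full_entropy)
        rows.append((k, mean_jsd, ent_corr))
    return rows

rng = random.Random(0)
pool = make_synthetic_pool(rng)
curve = saturation_curve(pool, ks=[1, 5, 10, 20, 50, 100], rng=rng)
for k, jsd, corr in curve:
    print(f"k={k:3d}  mean JSD={jsd:.4f}  entropy corr={corr:.3f}")
```

Reading the two columns side by side shows how each metric approaches its limit as k grows. With a single annotator every subsampled label is one-hot, so its entropy is always zero and the entropy correlation is undefined (reported as 0.0 here), which illustrates why an entropy-based metric cannot be satisfied by very small annotator counts. Results on synthetic votes need not match the paper's numbers.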

IMPACT Suggests optimizing annotation budgets based on evaluation metrics for improved model training.

RANK_REASON Research paper published by Apple's ML Research division.

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

Apple Machine Learning Research TIER_1 English(EN) · 2026-06-23 00:00

Metric-Dependent Annotation Saturation for Learning from Label Distributions

When annotators disagree on a label, the disagreement itself carries signal—and the number of annotators needed to capture it depends on the evaluation metric. We fine-tune NLI models on label distributions subsampled from ChaosNLI, a dataset providing 100 independent annotator j…
