PulseAugur

New methods tackle noisy labels in AI datasets

Researchers have proposed Standardized Loss Aggregation (SLA), a method for detecting noisy labels in large datasets, particularly in medical imaging. SLA quantifies label reliability by aggregating standardized losses from cross-validation runs, which yields a continuous and more informative score than hard-counting methods. According to the authors' experiments, SLA identifies ambiguous or mislabeled samples more effectively and faster, which can help improve dataset quality for classification tasks. A second study describes "uncertainty collapse": models trained on noisy labels can achieve high accuracy yet fail to reliably distinguish out-of-distribution data from misclassified in-distribution data.
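The loss-aggregation idea can be illustrated with a small sketch. The summary does not give the paper's exact formula, so everything here is an assumption: the function names are hypothetical, losses are z-scored across samples within each cross-validation run, and the per-sample score is the mean z-score over runs. A hard-counting baseline is included for contrast.

```python
import numpy as np

def standardized_loss_score(losses):
    """Continuous per-sample score from held-out losses.

    losses: array of shape (n_runs, n_samples), one held-out loss per
    sample per cross-validation run (hypothetical input layout).
    Assumption: standardize within each run, then average over runs.
    """
    losses = np.asarray(losses, dtype=float)
    mean = losses.mean(axis=1, keepdims=True)
    std = losses.std(axis=1, keepdims=True)
    z = (losses - mean) / np.where(std > 0, std, 1.0)  # guard against zero spread
    return z.mean(axis=0)

def hard_count_score(losses, threshold):
    """Baseline: number of runs in which a sample's loss exceeds a threshold."""
    return (np.asarray(losses, dtype=float) > threshold).sum(axis=0)

# Synthetic data: 5 runs, 8 samples, with sample 3 given a consistently high loss.
rng = np.random.default_rng(0)
losses = 0.2 + 0.05 * rng.random((5, 8))
losses[:, 3] += 2.0

scores = standardized_loss_score(losses)
counts = hard_count_score(losses, threshold=1.0)
suspect = int(np.argmax(scores))  # index of the most suspicious sample
```

The continuous score ranks every sample, while the hard count only records how often a threshold was crossed, so samples that are borderline in several runs are indistinguishable from clean ones under the baseline.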

Impact: New techniques for handling noisy labels can improve the reliability and robustness of AI models, especially in critical domains like medical imaging.

Ranking rationale: The cluster contains two academic papers on handling noisy labels in machine learning.


AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

  1. arXiv cs.AI TIER_1 English(EN) · Inhyuk Park, Doohyun Park

    Task-Agnostic Noisy Label Detection via Standardized Loss Aggregation

    arXiv:2605.10165v2. Abstract: Noisy labels are common in large-scale medical imaging datasets due to inter-observer variability and ambiguous cases. We propose a statistically grounded and task-agnostic framework, Standardized Loss Aggregation (SLA), f…

  2. Hugging Face Daily Papers TIER_1 English(EN)

    When Accuracy Is Not Enough: Uncertainty Collapse between Noisy Label Learning and Out-of-Distribution Detection

    Learning with noisy labels (LNL) is typically benchmarked by closed-set classification accuracy, yet deployment often requires classifiers to reject out-of-distribution (OOD) inputs. We present a learner-agnostic ACC-OOD benchmark that freezes LNL checkpoints and evaluates them w…
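One way to make the "uncertainty collapse" idea concrete is to check whether an uncertainty score separates OOD inputs from misclassified in-distribution inputs. The sketch below is not the benchmark's protocol, which the excerpt above truncates. It assumes a maximum-softmax-probability uncertainty score and AUROC as the separation measure, and all function names and logits are hypothetical. An AUROC near 0.5 would mean the two groups are indistinguishable.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def msp_uncertainty(logits):
    """Uncertainty as 1 minus the maximum softmax probability (an assumed score)."""
    return 1.0 - softmax(np.asarray(logits, dtype=float)).max(axis=1)

def auroc(pos_scores, neg_scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = np.asarray(pos_scores, dtype=float)[:, None]
    neg = np.asarray(neg_scores, dtype=float)[None, :]
    return float(((pos > neg) + 0.5 * (pos == neg)).mean())

# Hypothetical logits from a frozen classifier.
ood_flat = [[0.0, 0.0, 0.0], [0.1, 0.0, 0.0]]       # OOD inputs, low confidence
ood_confident = [[2.0, 0.0, 0.0], [3.0, 0.0, 0.0]]  # OOD inputs, high confidence
wrong_id = [[2.0, 0.0, 0.0], [3.0, 0.0, 0.0]]       # misclassified ID inputs

# Separable case: OOD inputs are clearly more uncertain than ID errors.
auroc_separable = auroc(msp_uncertainty(ood_flat), msp_uncertainty(wrong_id))
# Collapsed case: both groups receive the same uncertainty values.
auroc_collapsed = auroc(msp_uncertainty(ood_confident), msp_uncertainty(wrong_id))
```

Closed-set accuracy never sees this comparison, which is why a model can score well on accuracy while its uncertainty carries no information about which kind of error it is facing.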