PulseAugur

Researchers explore data symmetries to improve noisy dataset selection for ML

Researchers have developed a method for identifying an optimal subset of training data when the labels are noisy. The approach leverages data symmetries and invariance properties to improve the accuracy of k-nearest neighbors (k-NN) in selecting low-noise samples. The findings suggest that exploiting these underlying symmetries can yield performance comparable to training on a noise-free dataset, even in high-dimensional settings.
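To make the idea of k-NN selection on a symmetry-aware representation concrete, here is a minimal sketch. It is not the paper's algorithm: it assumes a toy 2D dataset whose clean label depends only on the radius (so rotations are the symmetry), and it keeps a sample only when its label agrees with the majority label of its k nearest neighbors in the invariant feature. The names `invariant_feature`, `knn_agreement_select`, and `noise_rate` are hypothetical and chosen for illustration.

```python
import math
import random

rng = random.Random(0)

# Toy data (assumption): the clean label depends only on the radius,
# so it is invariant under rotations about the origin.
n = 300
points, clean = [], []
for _ in range(n):
    r = rng.uniform(0.0, 2.0)
    theta = rng.uniform(0.0, 2.0 * math.pi)
    points.append((r * math.cos(theta), r * math.sin(theta)))
    clean.append(1 if r > 1.0 else 0)

# Flip exactly 20% of the labels to simulate label noise.
flipped = set(rng.sample(range(n), n // 5))
noisy = [1 - y if i in flipped else y for i, y in enumerate(clean)]


def invariant_feature(p):
    # Assumed symmetry: rotation, so the radius is an invariant feature.
    return (math.hypot(p[0], p[1]),)


def knn_agreement_select(features, labels, k=7):
    """Return indices whose label matches the majority of their k nearest neighbors."""
    selected = []
    for i, fi in enumerate(features):
        # Brute-force distances to every other point, nearest first.
        dists = sorted(
            (math.dist(fi, fj), j) for j, fj in enumerate(features) if j != i
        )
        neighbor_labels = [labels[j] for _, j in dists[:k]]
        majority = max(set(neighbor_labels), key=neighbor_labels.count)
        if labels[i] == majority:
            selected.append(i)
    return selected


def noise_rate(indices):
    # Fraction of the given samples whose noisy label differs from the clean one.
    return sum(noisy[i] != clean[i] for i in indices) / len(indices)


selected = knn_agreement_select([invariant_feature(p) for p in points], noisy)
print(f"noise rate, all samples: {noise_rate(range(n)):.3f}")
print(f"noise rate, selected:    {noise_rate(selected):.3f}")
```

Computing neighbors in the invariant feature means that points related by the assumed symmetry count as close, which is one plausible reading of why symmetries could help k-NN find low-noise samples. In practice the clean labels are unknown, so `noise_rate` is only available in a synthetic setting like this one.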

Impact: Improves robustness of models trained on potentially noisy real-world datasets.

Ranking rationale: Academic paper detailing a novel method for data selection in machine learning.

Read on arXiv cs.LG →

AI-generated summary · Google Gemini · from 1 source. How we write summaries →


Sources [1]

  1. arXiv cs.LG TIER_1 English (EN) · Kumar Shubham, Pavan Karjol, Kiran M K, Prathosh AP

    Leveraging Data Symmetries to Select an Optimal Subset of Training Data under Label Noise

    arXiv:2605.01874v1 Announce Type: new Abstract: The performance of machine learning models often relies on large labeled datasets; however, data collected from diverse sources can contain label noise. Recent work has shown that, in noisy settings, there may exist a subset of the …