Benchmarking Instance-Dependent Label Noise with Controlled Corruptions
Researchers have introduced CILN, a framework for generating synthetic instance-dependent label noise (IDN) benchmarks. Whereas earlier methods generated noise implicitly, CILN uses controlled input corruptions and a diverse voter pool, so the source and severity of ambiguity in each benchmark are explicit. Tested on the CIFAR10, MNIST, and Adult datasets, the resulting benchmarks exhibit genuine instance-dependent noise and can reveal failure modes in existing noisy-label learning methods such as Co-Teaching and DivideMix.
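
To make the idea of "controlled corruptions plus a voter pool" concrete, here is a minimal sketch in Python with NumPy. It is not CILN's actual procedure, which the summary above does not specify. Everything in it is an assumption made for illustration: the toy Gaussian data, the additive-noise corruption, the nearest-centroid voters trained on random feature subsets, and the function names `make_toy_data`, `fit_voter`, `voter_predict`, and `noisy_labels`. Each instance is corrupted at a chosen severity, every voter labels the corrupted input, and the noisy label is sampled from that instance's vote distribution, so the flip probability depends on the instance itself.

```python
import numpy as np


def make_toy_data(rng, n_per_class=200, n_classes=3, dim=8):
    """Hypothetical stand-in dataset: one Gaussian blob per class."""
    centers = rng.normal(scale=3.0, size=(n_classes, dim))
    X = np.concatenate([c + rng.normal(size=(n_per_class, dim)) for c in centers])
    y = np.repeat(np.arange(n_classes), n_per_class)
    return X, y


def fit_voter(X, y, feature_idx, n_classes):
    """A deliberately simple voter: class centroids on a feature subset."""
    return np.stack(
        [X[y == k][:, feature_idx].mean(axis=0) for k in range(n_classes)]
    )


def voter_predict(centroids, X, feature_idx):
    """Assign each row of X to the nearest centroid (squared Euclidean)."""
    diffs = X[:, feature_idx][:, None, :] - centroids[None, :, :]
    return (diffs ** 2).sum(axis=2).argmin(axis=1)


def noisy_labels(X, y, severity, n_voters=15, seed=0):
    """Return (noisy labels, per-instance vote distribution).

    Assumption: the corruption is additive Gaussian noise whose standard
    deviation equals `severity`; a real benchmark could use blur, masking,
    or feature dropout instead.
    """
    rng = np.random.default_rng(seed)
    n_classes = int(y.max()) + 1
    dim = X.shape[1]

    # Controlled corruption: severity is an explicit, recorded knob.
    X_corrupt = X + severity * rng.normal(size=X.shape)

    # Diverse voter pool: each voter sees a different random feature subset.
    votes = np.zeros((len(X), n_classes))
    for _ in range(n_voters):
        idx = rng.choice(dim, size=max(1, dim // 2), replace=False)
        centroids = fit_voter(X, y, idx, n_classes)
        pred = voter_predict(centroids, X_corrupt, idx)
        votes[np.arange(len(X)), pred] += 1
    probs = votes / n_voters

    # Sample one label per instance from its own vote distribution.
    cum = probs.cumsum(axis=1)
    u = rng.random(len(X))[:, None]
    y_noisy = np.minimum((u > cum).sum(axis=1), n_classes - 1)
    return y_noisy, probs


if __name__ == "__main__":
    X, y = make_toy_data(np.random.default_rng(0))
    for severity in (0.0, 2.0, 10.0):
        y_noisy, _ = noisy_labels(X, y, severity, seed=1)
        print(f"severity={severity}: noise rate={(y_noisy != y).mean():.3f}")
```

Because the severity and the vote distribution are stored alongside each noisy label, a benchmark built this way can report where the ambiguity came from, rather than only an aggregate noise rate.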