PulseAugur

Dataset Distillation Methods Underperform Coresets in Classification Tasks

A new research paper critically evaluates dataset distillation (DD) methods for classification tasks, finding that they often do not outperform simpler data pruning techniques like coresets. The study benchmarked seven state-of-the-art DD methods against three coreset strategies on large-scale datasets, revealing that DD methods can be computationally expensive and may not offer significant accuracy advantages. Coresets, in contrast, demonstrated competitive performance and better coverage of the original data distribution, suggesting they remain a more efficient alternative for data-centric machine learning.

IMPACT Challenges the efficacy of dataset distillation, suggesting coresets are a more efficient alternative for data-centric learning.

RANK_REASON The cluster contains an academic paper evaluating machine learning techniques.


AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. arXiv cs.LG TIER_1 English(EN) · Joshua Kimball

    Rethinking Dataset Distillation for Classification: Do Distilled Sets Outperform Coresets?

    Dataset distillation (DD) has emerged as a prominent approach in data-centric machine learning, aiming to synthesize compact training sets for efficient training by compressing the information in large datasets into a small number of synthetic samples. However, DD methods are oft…