PulseAugur

Dataset Distillation Falls Short Against Coreset Selection in New Study

A new research paper critically evaluates dataset distillation (DD) methods for classification and finds that they often fail to outperform simpler coreset selection (CS) strategies, especially on large-scale datasets such as ImageNet. The study benchmarks seven state-of-the-art DD methods against three CS strategies and reports that DD methods can be more computationally expensive while offering limited practical advantages. Coresets also cover the original data distribution better.
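To make the distinction concrete: a coreset is a subset of real training samples, whereas dataset distillation synthesizes new ones. The sketch below shows k-center greedy selection, one widely used coreset strategy. It is an illustrative assumption, not necessarily one of the three strategies benchmarked in the paper, and the function name `k_center_greedy` and its parameters are hypothetical.

```python
import numpy as np


def k_center_greedy(features: np.ndarray, budget: int, seed: int = 0) -> np.ndarray:
    """Return `budget` row indices chosen by k-center greedy selection.

    Illustrative sketch only: `features` is assumed to be an (n, d) array of
    per-sample embeddings; the paper's actual selection code is not shown here.
    """
    n = features.shape[0]
    if not 0 < budget <= n:
        raise ValueError("budget must be between 1 and the number of samples")

    rng = np.random.default_rng(seed)
    selected = [int(rng.integers(n))]  # start from a random sample

    # Distance from every sample to its nearest selected sample so far.
    min_dist = np.linalg.norm(features - features[selected[0]], axis=1)

    while len(selected) < budget:
        # Pick the sample that is currently worst covered by the selection.
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        dist_to_new = np.linalg.norm(features - features[nxt], axis=1)
        min_dist = np.minimum(min_dist, dist_to_new)

    return np.array(selected)


# Toy usage: keep 10 of 500 random 16-dimensional embeddings.
demo_features = np.random.default_rng(0).normal(size=(500, 16))
demo_indices = k_center_greedy(demo_features, budget=10)
demo_coreset = demo_features[demo_indices]  # real samples, nothing synthesized
```

Because each step adds the sample farthest from everything already chosen, the selected points spread across the feature space, which is one intuitive reason a coreset can cover the original distribution well.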

IMPACT Suggests that simpler, more efficient methods like coreset selection may be preferable to dataset distillation for many large-scale machine learning tasks.

RANK_REASON The cluster contains a research paper published on arXiv evaluating machine learning techniques.


AI-generated summary · Google Gemini · from 2 sources.

COVERAGE [2]

  1. arXiv cs.LG TIER_1 English(EN) · Trisha Mittal, Akshay Mehra, Joshua Kimball

    Rethinking Dataset Distillation for Classification: Do Distilled Sets Outperform Coresets?

    arXiv:2606.18209v1. Abstract: Dataset distillation (DD) has emerged as a prominent approach in data centric machine learning, aiming to synthesize compact training sets for efficient training by compressing the information in large datasets into a small number o…

  2. arXiv cs.LG TIER_1 English(EN) · Joshua Kimball

    Rethinking Dataset Distillation for Classification: Do Distilled Sets Outperform Coresets?

    Dataset distillation (DD) has emerged as a prominent approach in data centric machine learning, aiming to synthesize compact training sets for efficient training by compressing the information in large datasets into a small number of synthetic samples. However, DD methods are oft…