A new research paper critically evaluates dataset distillation (DD) methods, finding that they often do not outperform simpler coreset selection (CS) strategies, especially on large-scale datasets like ImageNet. The study benchmarks seven state-of-the-art DD methods against three CS strategies, revealing that DD methods can be more computationally expensive and offer limited practical advantages. Coresets also demonstrate better coverage of the original data distribution.
AI IMPACT Suggests that simpler, more efficient methods like coreset selection may be preferable to dataset distillation for many large-scale machine learning tasks.
RANK_REASON The cluster contains a research paper published on arXiv evaluating machine learning techniques.
- dataset distillation
- Hugging Face
- ImageNet100
- ImageNet-1K
- ImageNette
- arXiv
- data centric machine learning
AI-generated summary · Google Gemini · from 2 sources.