Researchers have developed SemiPrune, a method for pruning large deep-learning datasets when only a small part of the data is labeled. Existing pruning methods require fully labeled data, which is often costly to obtain. SemiPrune combines a small labeled set with semi-supervised learning to generate pseudo-labels for the unlabeled data, so that supervised pruning methods can be applied. It estimates example difficulty from the training dynamics derived from these pseudo-labels, which is reported to yield more accurate coreset selection and state-of-the-art performance on various specialized datasets.
AI IMPACT Offers a more cost-effective way to prepare large datasets for deep learning training, potentially accelerating research and development by reducing computational and storage requirements.
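The pipeline outlined in the summary (pseudo-label, score difficulty from training dynamics, select a coreset) can be illustrated with a minimal sketch. This is not the SemiPrune implementation: the nearest-centroid pseudo-labeler, the linear softmax classifier, the "one minus mean confidence" difficulty score, the choice to keep the hardest examples, and every function name below are assumptions chosen only to keep the example small and dependency-free.

```python
import math

def pseudo_label(labeled, unlabeled, n_classes):
    """Label each unlabeled point with the class of its nearest labeled centroid.
    Assumed stand-in for a real semi-supervised learner; needs >= 1 labeled
    example per class."""
    dim = len(labeled[0][0])
    sums = [[0.0] * dim for _ in range(n_classes)]
    counts = [0] * n_classes
    for x, y in labeled:
        counts[y] += 1
        for j, v in enumerate(x):
            sums[y][j] += v
    centroids = [[s / counts[c] for s in sums[c]] for c in range(n_classes)]

    def nearest(x):
        return min(range(n_classes),
                   key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))

    return [(x, nearest(x)) for x in unlabeled]

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def difficulty_scores(data, n_classes, epochs=20, lr=0.5):
    """Train a linear softmax classifier with full-batch gradient descent and
    record, per example, the probability assigned to its (pseudo-)label at each
    epoch. Difficulty is 1 minus the mean of those probabilities (an assumed
    training-dynamics score)."""
    dim, n = len(data[0][0]), len(data)
    W = [[0.0] * dim for _ in range(n_classes)]
    b = [0.0] * n_classes
    conf_sum = [0.0] * n
    for _ in range(epochs):
        gW = [[0.0] * dim for _ in range(n_classes)]
        gb = [0.0] * n_classes
        for i, (x, y) in enumerate(data):
            logits = [sum(w * v for w, v in zip(W[c], x)) + b[c]
                      for c in range(n_classes)]
            probs = softmax(logits)
            conf_sum[i] += probs[y]
            for c in range(n_classes):
                err = probs[c] - (1.0 if c == y else 0.0)  # cross-entropy gradient
                gb[c] += err
                for j, v in enumerate(x):
                    gW[c][j] += err * v
        for c in range(n_classes):
            b[c] -= lr * gb[c] / n
            for j in range(dim):
                W[c][j] -= lr * gW[c][j] / n
    return [1.0 - s / epochs for s in conf_sum]

def select_coreset(scores, keep_fraction, keep_hardest=True):
    """Return sorted indices of the kept examples, ranked by difficulty."""
    k = max(1, round(keep_fraction * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=keep_hardest)
    return sorted(ranked[:k])

def prune_with_pseudo_labels(labeled, unlabeled, n_classes, keep_fraction=0.5):
    """Pseudo-label, score all examples, then select the coreset."""
    data = list(labeled) + pseudo_label(labeled, unlabeled, n_classes)
    scores = difficulty_scores(data, n_classes)
    return select_coreset(scores, keep_fraction), scores
```

Here `labeled` is a list of `(features, label)` pairs and `unlabeled` is a list of feature tuples; the returned indices refer to the labeled examples followed by the unlabeled ones. Whether a pruning method should keep the hardest or the easiest examples generally depends on the pruning ratio, so `keep_hardest` is left as a switch rather than a fixed choice.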
RANK_REASON This is a research paper detailing a new method for dataset pruning.
AI-generated summary · Google Gemini · from 1 source.