Less Data, Faster Training: repeating smaller datasets speeds up learning via sampling biases
Researchers have found that repeating a smaller dataset during AI model training can speed up learning compared with training on a larger dataset. This phenomenon, termed the "small-vs-large gap," yields compute savings and is not fully explained by existing theories. The study attributes the speedup to layer-wise growth facilitated by sampling biases, which are more effective with smaller datasets, and suggests that dataset repetition can serve as a proactive optimization strategy, especially for reasoning tasks.
AI IMPACT: Suggests a new method for optimizing AI training that could reduce compute costs and improve performance, particularly for reasoning tasks.