A Task-Centric Theory for Iterative Self-Improvement with Easy-to-Hard Curricula
Researchers have developed a theoretical framework for iterative self-improvement in large language models, analyzing how models fine-tune themselves on their own verified outputs. The study reveals a feedback loop where improved models can process more data, leading to sustained improvement that eventually saturates. By adopting a task-centric approach with varying difficulty levels, the research demonstrates that curricula progressing from easier to harder tasks offer provably better results than fixed task mixtures.
AI IMPACT: Provides a theoretical foundation for self-improving LLMs, potentially guiding future model development and training strategies.