Researchers have developed AC-ODM, a method that uses reinforcement learning to optimize the composition of pretraining data for large language models (LLMs). The approach improves sample efficiency, reducing pretraining time by up to 66% while improving downstream accuracy on benchmarks such as MMLU and HumanEval. AC-ODM supports both proxy and direct training modes and adds only minimal computational overhead.
AI IMPACT This method could reduce the computational cost and time required for LLM pretraining, potentially accelerating development and deployment.
RANK_REASON This item is a research paper detailing a new method for LLM pretraining.
Read on Hugging Face Daily Papers →
- AC-ODM
- Dang'ai
- HumanEval
- International Conference on Machine Learning
- LLM
- Massive Multitask Language Understanding
- Pythia 1B
- reinforcement learning
AI-generated summary · Google Gemini · from 1 source.