Freeze Deep, Train Shallow: Interpretable Layer Allocation for Continued Pre-Training
Researchers have developed LayerTracer, a framework that guides which layers of a large language model to update during continued pre-training. The method analyzes how layer representations evolve and how sensitive each layer is, in order to identify the layers that are critical for task execution and stability. In experiments, freezing deep layers while training shallow ones outperforms both full-parameter fine-tuning and the reverse strategy (training deep layers while freezing shallow ones) on benchmarks such as C-Eval and CMMLU.
AI IMPACT: Provides a low-cost, interpretable method for optimizing LLM continued pre-training, benefiting resource-constrained teams.
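The "train shallow, freeze deep" strategy described in the summary can be sketched as a per-parameter plan. This is a minimal, framework-agnostic illustration and not LayerTracer's implementation: the function name `plan_trainable`, the `layers.<i>.` parameter naming, the cut point `num_shallow`, and the choice to freeze parameters outside numbered layers are all assumptions, since the summary does not say how the boundary is chosen.

```python
import re

# Assumed naming scheme: parameters of transformer block i contain "layers.<i>.".
LAYER_PATTERN = re.compile(r"layers\.(\d+)\.")


def plan_trainable(param_names, num_layers, num_shallow, train_other=False):
    """Map each parameter name to True (train) or False (freeze).

    Blocks with index < num_shallow are treated as "shallow" and trained;
    the remaining, deeper blocks are frozen. Parameters outside numbered
    blocks (embeddings, final norm, output head) follow `train_other`,
    which is an assumption of this sketch.
    """
    if not 0 <= num_shallow <= num_layers:
        raise ValueError("num_shallow must be between 0 and num_layers")

    plan = {}
    for name in param_names:
        match = LAYER_PATTERN.search(name)
        if match is None:
            plan[name] = train_other
        else:
            plan[name] = int(match.group(1)) < num_shallow
    return plan


# Toy 4-block model with hypothetical parameter names.
names = [
    "embed_tokens.weight",
    "layers.0.mlp.weight",
    "layers.1.mlp.weight",
    "layers.2.mlp.weight",
    "layers.3.mlp.weight",
    "norm.weight",
]
plan = plan_trainable(names, num_layers=4, num_shallow=2)
for name, trainable in plan.items():
    print(f"{name}: {'train' if trainable else 'freeze'}")
```

With a PyTorch model, such a plan could be applied by iterating over `model.named_parameters()` and setting each parameter's `requires_grad` to the planned value. Where to place the cut is the question a layer analysis like LayerTracer's would need to answer; here it is simply passed in.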