PulseAugur

New theory explains how pretraining shapes machine learning model fine-tuning

Researchers have developed a theoretical framework explaining how pretraining shapes the inductive bias of fine-tuning in machine learning models. Their analysis of diagonal linear networks identifies four distinct fine-tuning regimes, determined by initialization parameters and task statistics. The study suggests that smaller initialization scales in earlier layers of a network can enhance feature reuse and refinement, leading to better generalization on tasks that use a subset of the pretraining features. The findings were validated empirically with ResNets on the CIFAR-100 and SVHN datasets and with Transformers on modular arithmetic tasks.
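
For readers unfamiliar with the setting, the following is a minimal sketch of a pretrain-then-fine-tune loop on a two-layer diagonal linear network. It assumes the common parametrization f(x) = Σ_i w2[i]·w1[i]·x[i]; the paper's exact model, tasks, and initialization scheme may differ. All names (`init_diagonal_net`, `train`), the scales 0.01 and 1.0, and the synthetic tasks are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def init_diagonal_net(dim, first_layer_scale, second_layer_scale):
    """Two-layer diagonal linear net: f(x) = sum_i w2[i] * w1[i] * x[i].
    The two scales are hypothetical knobs for per-layer initialization."""
    w1 = np.full(dim, first_layer_scale, dtype=float)
    w2 = np.full(dim, second_layer_scale, dtype=float)
    return w1, w2

def predict(w1, w2, X):
    # The effective linear weights are the elementwise product w1 * w2.
    return X @ (w1 * w2)

def mse(w1, w2, X, y):
    return float(np.mean((predict(w1, w2, X) - y) ** 2))

def train(w1, w2, X, y, lr=0.01, steps=2000):
    """Full-batch gradient descent on the mean squared error."""
    w1, w2 = w1.copy(), w2.copy()
    n = X.shape[0]
    for _ in range(steps):
        residual = predict(w1, w2, X) - y
        grad_eff = 2.0 * X.T @ residual / n  # gradient w.r.t. w1 * w2
        g1 = grad_eff * w2                   # chain rule through the product
        g2 = grad_eff * w1
        w1 -= lr * g1
        w2 -= lr * g2
    return w1, w2

rng = np.random.default_rng(0)
dim, n = 20, 200
X = rng.standard_normal((n, dim))

# Synthetic tasks (assumed): pretraining uses 6 features,
# fine-tuning uses a subset of 3 of them.
beta_pre = np.zeros(dim)
beta_pre[:6] = 1.0
beta_ft = np.zeros(dim)
beta_ft[:3] = 1.0
y_pre, y_ft = X @ beta_pre, X @ beta_ft

# Pretrain from a small first-layer scale, then fine-tune from the result.
w1_0, w2_0 = init_diagonal_net(dim, first_layer_scale=0.01, second_layer_scale=1.0)
loss_pre_start = mse(w1_0, w2_0, X, y_pre)
w1_pre, w2_pre = train(w1_0, w2_0, X, y_pre)
loss_pre_end = mse(w1_pre, w2_pre, X, y_pre)

loss_ft_start = mse(w1_pre, w2_pre, X, y_ft)
w1_ft, w2_ft = train(w1_pre, w2_pre, X, y_ft)
loss_ft_end = mse(w1_ft, w2_ft, X, y_ft)

print(f"pretrain loss:  {loss_pre_start:.4f} -> {loss_pre_end:.4f}")
print(f"fine-tune loss: {loss_ft_start:.4f} -> {loss_ft_end:.4f}")
```

Varying `first_layer_scale` and `second_layer_scale` and comparing the fine-tuned weights `w1_ft * w2_ft` is one way to explore the kind of initialization dependence the summary describes; the sketch itself does not reproduce the paper's results.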

IMPACT Provides a theoretical understanding of how pretraining impacts fine-tuning, potentially guiding future model development and optimization strategies.

RANK_REASON The item is an academic paper detailing a theoretical framework for machine learning.

Read on arXiv cs.LG →

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. arXiv cs.LG TIER_1 English (EN) · Nicolas Anguita, Francesco Locatello, Andrew M. Saxe, Marco Mondelli, Flavia Mancini, Samuel Lippl, Clementine Domine

    A Theory of How Pretraining Shapes Inductive Bias in Fine-Tuning

    arXiv:2602.20062v2. Abstract: Pretraining and fine-tuning are central stages in modern machine learning systems. In practice, feature learning plays an important role across both stages: deep neural networks learn a broad range of useful features during pret…