A Theory of How Pretraining Shapes Inductive Bias in Fine-Tuning

New theory explains how pretraining shapes the fine-tuning of machine learning models

By the PulseAugur editorial team · [1 source] · 2026-07-01 04:00

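Researchers have developed a theoretical framework that explains how pretraining shapes the inductive bias of machine learning models during fine-tuning. Their analysis of diagonal linear networks identifies four distinct fine-tuning regimes, determined by the initialization parameters and the task statistics. The study shows that a smaller initialization scale in the network's early layers can strengthen feature reuse and refinement, which yields better generalization on tasks that rely on a subset of the pretrained features. The findings are validated empirically with ResNets on the CIFAR-100 and SVHN datasets and with Transformers on modular arithmetic tasks.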
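As a rough illustration of why initialization scale matters in a diagonal linear network, the sketch below trains one on a sparse regression problem from a small and a large initialization and compares the solutions. It is a toy example and not the paper's setup: the parameterization `beta = u*u - v*v`, the helper name `train_diagonal_net`, the problem sizes, the learning rate, and the two scales are all assumptions chosen for illustration.

```python
import numpy as np

def train_diagonal_net(X, y, alpha, lr=0.002, steps=20000):
    """Gradient descent on a diagonal linear network beta = u*u - v*v.

    Hypothetical helper for illustration. Both factors start at `alpha`,
    so the effective weights beta start at zero for any scale.
    """
    n, d = X.shape
    u = np.full(d, alpha)
    v = np.full(d, alpha)
    for _ in range(steps):
        beta = u * u - v * v
        # Gradient of the loss (1/2n) * ||X beta - y||^2 with respect to beta.
        grad_beta = X.T @ (X @ beta - y) / n
        # Chain rule: d beta / d u = 2u and d beta / d v = -2v.
        u, v = u - lr * 2 * u * grad_beta, v + lr * 2 * v * grad_beta
    return u * u - v * v

# Underdetermined problem: 20 samples, 40 features, 3 of them relevant.
rng = np.random.default_rng(0)
n, d, k = 20, 40, 3
X = rng.standard_normal((n, d))
beta_true = np.zeros(d)
beta_true[:k] = 1.0
y = X @ beta_true

beta_small = train_diagonal_net(X, y, alpha=0.01)  # small initialization scale
beta_large = train_diagonal_net(X, y, alpha=2.0)   # large initialization scale

# Distance of each learned weight vector from the sparse ground truth.
err_small = np.linalg.norm(beta_small - beta_true)
err_large = np.linalg.norm(beta_large - beta_true)
print("error, small init:", err_small)
print("error, large init:", err_large)
```

In this toy setting the small initialization tends to concentrate weight on a few coordinates, while the large one spreads it across many. That loosely mirrors the summary's point that initialization scale changes which solution training favors, although the paper's own analysis of pretrained features and early layers may differ in its details.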

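Impact: Offers a theoretical understanding of how pretraining affects fine-tuning, which may guide future model development and optimization strategies.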

Ranking rationale: This item is an academic paper that details a theoretical framework for machine learning.


AI-generated summary · Google Gemini · from 1 source.

Sources [1]

arXiv cs.LG · Nicolas Anguita, Francesco Locatello, Andrew M. Saxe, Marco Mondelli, Flavia Mancini, Samuel Lippl, Clementine Domine · 2026-07-01 04:00

A Theory of How Pretraining Shapes Inductive Bias in Fine-Tuning

arXiv:2602.20062v2. Abstract: Pretraining and fine-tuning are central stages in modern machine learning systems. In practice, feature learning plays an important role across both stages: deep neural networks learn a broad range of useful features during pret…
