How Does the Pretraining Distribution Shape In-Context Learning? A Fundamental Trade-Off

Study finds that the pretraining data distribution shapes LLM in-context learning

By the PulseAugur editorial team · [1 source] · 2026-06-25 04:00

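A new theoretical framework and empirical study examine how the statistical properties of pretraining data affect in-context learning (ICL) in large language models (LLMs). The researchers find that heavy-tailed pretraining distributions help task selection under distribution shift but hinder generalization, particularly when data are sparse. The study suggests that controlling these statistical properties is essential for building reliable LLMs with strong ICL capabilities.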

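Impact: Offers theoretical and empirical insight into how the properties of pretraining data affect LLM adaptability, which may guide future model development toward better in-context learning.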

Ranking rationale: Academic paper presenting a theoretical framework and an empirical evaluation of LLM pretraining distributions and their effect on in-context learning.

AI-generated summary · Google Gemini · from 1 source.

Sources [1]

arXiv stat.ML TIER_1 English (EN) · Waïss Azizian, Ali Hasan · 2026-06-25 04:00

How Does the Pretraining Distribution Shape In-Context Learning? A Fundamental Trade-Off

arXiv:2510.01163v2 Announce Type: replace-cross Abstract: The factors driving the performance of in-context learning (ICL) in large language models (LLMs) remain poorly understood despite ICL's surprising effectiveness, enabling models to adapt to new tasks from only a handful of…