Decomposing Prediction Mechanisms for In-Context Recall

By the PulseAugur editorial team · [1 source] · 2026-06-18 04:00

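Researchers have developed a new family of toy problems to probe how Transformer models handle in-context learning (ICL) and associative recall. The problems train models on interleaved state observations drawn from linear dynamical systems, and require the model to recall a specific system's state from its symbolic label. The study finds that the ability to recall states emerges only late in training, after the models can already continue predicting an ongoing sequence. Mechanistic analysis indicates that next-token prediction in this setting involves at least two distinct mechanisms: one for associative recall using discrete labels, and another for …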
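To make the setup more concrete, here is a minimal sketch of what an interleaved, labeled trace might look like. It is an illustration under stated assumptions, not the paper's actual data pipeline: the helper names (`make_system`, `interleaved_trace`), the noise-free dynamics, the fixed segment length, and the use of an integer as the symbolic label are all hypothetical choices made here for clarity.

```python
import numpy as np

def make_system(dim, rng):
    """Random stable linear system x_{t+1} = A x_t (hypothetical setup)."""
    A = rng.standard_normal((dim, dim))
    # Rescale so the spectral radius is 0.95, which keeps trajectories bounded.
    A *= 0.95 / np.max(np.abs(np.linalg.eigvals(A)))
    return A

def interleaved_trace(num_systems=3, dim=2, seg_len=4, num_segments=6, seed=0):
    """Interleave segments from several systems, tagging each state with a label.

    Returns the system matrices and a list of (label, state) pairs.
    Assumption: the label is simply the integer index of the system.
    """
    rng = np.random.default_rng(seed)
    systems = [make_system(dim, rng) for _ in range(num_systems)]
    states = [rng.standard_normal(dim) for _ in range(num_systems)]
    trace = []
    for _ in range(num_segments):
        k = int(rng.integers(num_systems))  # which system this segment follows
        for _ in range(seg_len):
            trace.append((k, states[k].copy()))
            # Advance only the active system; the others stay paused.
            states[k] = systems[k] @ states[k]
    return systems, trace

systems, trace = interleaved_trace()
for label, state in trace[:8]:
    print(label, np.round(state, 3))
```

In this sketch, the first state of a resumed segment depends on the last state seen under the same label, possibly many positions earlier, so predicting it requires recall by label. Predicting later states within a segment only requires continuing from the previous state.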



AI-generated summary · Google Gemini · from 1 source.

Sources [1]

arXiv cs.LG · Sultan Daniels, Dylan Davis, Dhruv Gautam, Wentinn Liao, Gireeja Ranade, Anant Sahai · 2026-06-18 04:00

Decomposing Prediction Mechanisms for In-Context Recall

arXiv:2507.01414v2. Abstract: We introduce a new family of toy problems that combine features of linear-regression-style continuous in-context learning (ICL) with discrete associative recall. We pretrain transformer models on sample traces from this toy, spe…