Deep sequence models tend to memorize geometrically; it is unclear why

New "geometric memory" discovered in deep sequence models

By PulseAugur Editorial Team · [1 source] · 2026-05-19 04:00

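Researchers have identified a new form of memory storage in deep sequence models, termed "geometric memory", that differs from the typical associative memory. Geometric memory lets a model synthesize global relationships among entities, even entities that never appeared together in the training data. The study shows that, contrary to prevailing theories, this phenomenon arises naturally from a spectral bias, and it offers insights for improving the memory capabilities of Transformers.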
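The contrast between the two kinds of memory can be made concrete with a toy sketch. The snippet below is an illustration only, not the paper's experiment or method: it assumes the "facts" are the edges of a small path graph, stores them once as a lookup table of co-occurring pairs, and once as a one-dimensional spectral embedding (the Fiedler vector of the graph Laplacian, computed by power iteration). The graph, the function names, and the choice of a Laplacian embedding are all assumptions made for this example.

```python
# Illustrative toy only: not the paper's experiment or method.
# Assumption: facts are edges of a path graph 0-1-2-...-7, so only
# neighbouring entities ever "co-occur" in the training data.

N = 8
edges = [(i, i + 1) for i in range(N - 1)]

# Associative memory: a brute-force lookup of co-occurring pairs.
seen_pairs = {frozenset(e) for e in edges}


def associative_lookup(a, b):
    """Return True only if a and b were stored together."""
    return frozenset((a, b)) in seen_pairs


def laplacian_times(v):
    """Multiply the graph Laplacian L = D - A by the vector v."""
    out = [0.0] * N
    for a, b in edges:
        out[a] += v[a] - v[b]
        out[b] += v[b] - v[a]
    return out


def fiedler_embedding(iterations=500):
    """1-D spectral embedding via power iteration on (4I - L).

    Every Laplacian eigenvalue of a path graph is below 4, so 4I - L is
    positive definite. Removing the mean at each step deflates the
    constant eigenvector, and the iteration converges to the Fiedler
    vector (the eigenvector of the smallest non-zero eigenvalue).
    """
    v = [float(i) for i in range(N)]  # deterministic starting vector
    for _ in range(iterations):
        mean = sum(v) / N
        v = [x - mean for x in v]  # project out the constant vector
        lv = laplacian_times(v)
        v = [4.0 * x - y for x, y in zip(v, lv)]  # apply (4I - L)
        norm = sum(x * x for x in v) ** 0.5
        v = [x / norm for x in v]
    return v


embedding = fiedler_embedding()

# Sorting entities by their coordinate recovers the global chain order,
# although the stored facts only ever mention adjacent pairs.
recovered_order = sorted(range(N), key=lambda i: embedding[i])

print(associative_lookup(0, 5))  # the lookup table has no entry for (0, 5)
print(recovered_order)
print(abs(embedding[0] - embedding[5]), abs(embedding[0] - embedding[1]))
```

In this toy setting the lookup table can say nothing about entities 0 and 5, while the embedding places them far apart relative to neighbours 0 and 1. That is the sense in which a geometric representation encodes global relationships that were never stored as individual facts; whether trained sequence models behave this way is the claim of the paper, not something this sketch demonstrates.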

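Impact: Introduces a new theoretical framework for understanding model memory, which may guide future research on knowledge acquisition and model capacity.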

Ranking rationale: This cluster contains an academic paper detailing a new finding about deep sequence models.

AI-generated summary · Google Gemini · from 1 source.

Sources [1]

arXiv stat.ML TIER_1 English(EN) · Shahriar Noroozizadeh, Vaishnavh Nagarajan, Elan Rosenfeld, Sanjiv Kumar · 2026-05-19 04:00

Deep sequence models tend to memorize geometrically; it is unclear why

arXiv:2510.26745v3. Abstract: Deep sequence models are said to store atomic facts predominantly in the form of associative memory: a brute-force lookup of co-occurring entities. We identify a dramatically different form of storage of atomic facts that …