PulseAugur

Transformer Study Sheds Light on Compositional Generalization in Arithmetic

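Researchers have published a study of compositional arithmetic in Transformers, examining how these models generalize to unseen combinations of variables and numbers. The paper, titled "Assign and Add: A Mechanistic Study of Compositional Arithmetic", analyzes a controlled setting that involves variable assignment and modular addition. The results indicate that compositional generalization can emerge naturally from a Transformer's internal mechanisms, and that the training dynamics show distinct learning phases.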
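To make the setting concrete, here is a minimal Python sketch of what a variable-assignment plus modular-addition task with held-out combinations could look like. The prompt format, the function name `build_dataset`, the variable names, the modulus, and the choice of held-out (variable, value) pairs are all illustrative assumptions; the summary does not specify the paper's actual data format or split.

```python
import itertools


def build_dataset(variables, modulus, held_out_pairs):
    """Enumerate prompts such as 'a=3 b=4 a+b=' with target (3 + 4) mod modulus.

    Hypothetical split: any example that uses a held-out (variable, value)
    pair goes to the test set, so the test set probes combinations of
    variables and numbers that never appear in training.
    """
    train, test = [], []
    for x, y in itertools.permutations(variables, 2):
        for vx, vy in itertools.product(range(modulus), repeat=2):
            prompt = f"{x}={vx} {y}={vy} {x}+{y}="
            target = (vx + vy) % modulus  # modular addition
            unseen = (x, vx) in held_out_pairs or (y, vy) in held_out_pairs
            (test if unseen else train).append((prompt, target))
    return train, test


# Illustrative parameters only (assumptions, not taken from the paper).
variables = ["a", "b", "c"]
modulus = 5
held_out = {("a", 4), ("c", 0)}

train, test = build_dataset(variables, modulus, held_out)
```

A model trained on `train` and evaluated on `test` would then be asked to add values for variable-number pairings it has never seen together, which is the kind of generalization the summary describes.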

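Impact: Offers theoretical and empirical insight into how Transformers achieve compositional generalization, which may inform future model architectures.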

Ranking rationale: This cluster contains one academic paper that details a mechanistic study of compositional generalization in Transformers.

AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

  1. arXiv stat.ML TIER_1 English(EN) · Brady Exoo, Alberto Bietti, John Sous ·

    Assign and Add: A Mechanistic Study of Compositional Arithmetic

    arXiv:2605.31497v1 Announce Type: cross Abstract: Large language models are able to compose skills in order to perform complex tasks, many of which might not have been seen during training. The details of how exactly this composition occurs remain elusive. In this paper, we study…

  2. arXiv stat.ML TIER_1 English(EN) · John Sous ·

    Assign and Add: A Mechanistic Study of Compositional Arithmetic

    Large language models are able to compose skills in order to perform complex tasks, many of which might not have been seen during training. The details of how exactly this composition occurs remain elusive. In this paper, we study a mechanism for compositional generalization in t…