The Right Answer, the Wrong Direction: Why Transformers Fail at Counting and How to Fix It

Researchers find that Transformers know the count but struggle to output it

By the PulseAugur editorial team · [2 sources] · 2026-05-05 01:13

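A new paper identifies a specific bottleneck in Transformer models that hampers their ability to perform counting tasks. The researchers found that although models such as Pythia, Qwen3, and Mistral store count information accurately in their internal representations, they struggle to convert that information into the correct output token. Targeted interventions on attention weights significantly improved the models' generation of correct counts in autoregressive tasks, which suggests a geometric misalignment in the output pathway.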
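To make the idea of a "geometric misalignment" concrete, here is a toy sketch in NumPy. It is not the paper's method or data: the dimensions, the synthetic hidden states, and every name (`store_dirs`, `read_dirs`, `align`) are hypothetical, and a simple rotation stands in for the attention-weight intervention the paper describes. The sketch assumes the count is stored along one set of directions while the output layer reads from a different, orthogonal set.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_counts, n_samples = 16, 5, 500

# Hypothetical geometry: an orthonormal basis whose first 5 axes hold the
# count internally and whose next 5 axes are what the output layer reads.
basis, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
store_dirs = basis[:, :n_counts].T               # (5, dim) internal count directions
read_dirs = basis[:, n_counts:2 * n_counts].T    # (5, dim) toy unembedding rows

# Synthetic hidden states: the true count's direction plus small noise.
counts = rng.integers(0, n_counts, size=n_samples)
hidden = store_dirs[counts] + 0.05 * rng.normal(size=(n_samples, dim))

# 1) A linear probe (least squares onto one-hot labels) recovers the count.
onehot = np.eye(n_counts)[counts]
probe, *_ = np.linalg.lstsq(hidden, onehot, rcond=None)
probe_acc = np.mean((hidden @ probe).argmax(axis=1) == counts)

# 2) The misaligned readout sees only noise, so it is near chance.
readout_acc = np.mean((hidden @ read_dirs.T).argmax(axis=1) == counts)

# 3) Stand-in fix: map each stored direction onto its readout direction.
align = read_dirs.T @ store_dirs                 # (dim, dim) linear map
fixed_logits = (hidden @ align.T) @ read_dirs.T
fixed_acc = np.mean(fixed_logits.argmax(axis=1) == counts)

print(f"probe: {probe_acc:.2f}  readout: {readout_acc:.2f}  fixed: {fixed_acc:.2f}")
```

In this toy setting the probe is nearly perfect while the raw readout hovers around chance (1 in 5), and realigning the directions closes the gap. That mirrors the summary's claim only qualitatively.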

Impact: Identifies a specific readout bottleneck for Transformers on counting tasks, which may guide future model architectures.

Ranking rationale: This cluster contains an academic paper detailing new findings on the limitations of Transformer models.


AI-generated summary · Google Gemini · based on 2 sources.

Sources [2]

arXiv cs.LG TIER_1 English(EN) · Gabriel Garcia · 2026-05-06 04:00

The Right Answer, the Wrong Direction: Why Transformers Fail at Counting and How to Fix It

arXiv:2605.03258v1 Announce Type: new Abstract: Large language models often fail at simple counting tasks, even when the items to count are explicitly present in the prompt. We investigate whether this failure occurs because transformers do not represent counts internally, or bec…
arXiv cs.CL TIER_1 English(EN) · Gabriel Garcia · 2026-05-05 01:13

The Right Answer, the Wrong Direction: Why Transformers Fail at Counting and How to Fix It

Large language models often fail at simple counting tasks, even when the items to count are explicitly present in the prompt. We investigate whether this failure occurs because transformers do not represent counts internally, or because they cannot convert those representations i…
