A new paper identifies a specific bottleneck in Transformer models that hinders their ability to perform counting tasks. Researchers found that while models like Pythia, Qwen3, and Mistral store count information accurately internally, they struggle to translate this information into the correct output tokens. A targeted intervention on attention weights significantly improved the models' ability to generate correct counts in autoregressive tasks, suggesting a geometric misalignment in the output pathway.
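The "geometric misalignment" idea can be illustrated with a toy sketch: the count is encoded along one direction in the hidden state, but the output pathway reads along a different direction, so the decoded count comes out wrong even though the information is present. This is a hypothetical numpy illustration of the concept, not the paper's actual models or intervention; all vectors and names here are invented.

```python
# Toy illustration (hypothetical, not the paper's method): a count is stored
# along one direction in the hidden state, but read out along a misaligned one.
import numpy as np

store_dir = np.array([1.0, 0.0])    # direction the count is written to internally
readout_dir = np.array([0.6, 0.8])  # misaligned direction the output head reads from

count = 7
hidden = count * store_dir          # internal state: the count is encoded accurately

probe = round(float(hidden @ store_dir))    # linear probe on internals recovers the count
naive = round(float(hidden @ readout_dir))  # naive readout undercounts (7 * 0.6 = 4.2)

# A corrective gain on the readout, analogous in spirit to the paper's
# attention-weight intervention, compensates for the misalignment.
gain = 1.0 / float(store_dir @ readout_dir)        # 1 / cos(angle between directions)
fixed = round(float(hidden @ readout_dir) * gain)  # recovers the correct count
```

The point of the sketch: a probe on the internal state sees the right count while the output pathway does not, and a simple geometric correction on the readout closes the gap.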
Summary written by gemini-2.5-flash-lite from 2 sources.
IMPACT Identifies a specific readout bottleneck in Transformers for counting tasks, potentially guiding future model architectures.
RANK_REASON The cluster contains an academic paper detailing a novel finding about Transformer model limitations.