
StateX framework boosts RNN recall by expanding model states post-training

Researchers have developed StateX, a post-training framework designed to improve the recall capabilities of recurrent neural networks (RNNs). The method efficiently expands the states of pre-trained RNNs, such as linear attention and state-space models, without significantly increasing model parameters. Experiments show StateX enhances recall and in-context learning in models of up to 1.3 billion parameters without degrading other capabilities.
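For context on why state size matters here, the sketch below shows the standard linear-attention recurrence (not StateX itself): the model carries a fixed d_k × d_v state that is written to and read from once per token, so per-token cost is constant in sequence length, and the state's size bounds how much it can recall. The `linear_attention` helper and its dimensions are illustrative, not from the paper; how StateX actually expands the state post-training is detailed in the paper.

```python
# Minimal pure-Python sketch of a linear-attention recurrent state.
# Per token:  S_t = S_{t-1} + k_t v_t^T   (write, rank-1 update)
#             y_t = S_t^T q_t             (read)
# The state S is d_k x d_v regardless of context length; a larger state
# (the quantity StateX expands post-training) can store more associations.

def outer(k, v):
    # rank-1 outer product k v^T as a d_k x d_v list of lists
    return [[ki * vj for vj in v] for ki in k]

def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def read(S, q):
    # y_j = sum_i S[i][j] * q[i], i.e. y = S^T q
    d_v = len(S[0])
    return [sum(S[i][j] * q[i] for i in range(len(q))) for j in range(d_v)]

def linear_attention(queries, keys, values, d_k, d_v):
    S = [[0.0] * d_v for _ in range(d_k)]  # constant-size recurrent state
    outputs = []
    for q, k, v in zip(queries, keys, values):
        S = mat_add(S, outer(k, v))  # write the current key/value pair
        outputs.append(read(S, q))   # query the accumulated state
    return outputs

# Two tokens with orthogonal keys: querying with the second key
# retrieves the second value from the state.
ys = linear_attention(
    queries=[[0.0, 1.0], [0.0, 1.0]],
    keys=[[1.0, 0.0], [0.0, 1.0]],
    values=[[2.0], [3.0]],
    d_k=2, d_v=1,
)
# ys == [[0.0], [3.0]]
```

With more key/value pairs than the state can hold orthogonally, retrievals interfere, which is the recall limitation that motivates expanding the state.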

Summary written by gemini-2.5-flash-lite from 1 source.

IMPACT Enhances recall for RNNs, potentially improving performance on tasks requiring long-context understanding.

RANK_REASON This is a research paper introducing a new framework for improving RNN performance.


COVERAGE [1]

  1. arXiv cs.CL TIER_1 · Xingyu Shen, Yingfa Chen, Zhen Leng Thai, Xu Han, Zhiyuan Liu, Maosong Sun

    StateX: Enhancing RNN Recall via Post-training State Expansion

    arXiv:2509.22630v3 (announce type: replace). Abstract: Recurrent neural networks (RNNs), such as linear attention and state-space models, have gained popularity due to their constant per-token complexity when processing long contexts. However, these recurrent models struggle with ta…