PulseAugur

StateX framework boosts RNN recall by expanding model states post-training

Researchers have developed StateX, a post-training framework that improves recall in recurrent neural networks (RNNs). It efficiently expands the states of pre-trained RNNs, such as linear attention and state-space models, without significantly increasing the parameter count. Experiments show that StateX improves recall and in-context learning in models of up to 1.3 billion parameters without compromising other capabilities.
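
For readers unfamiliar with what "state" means in this setting, the following is a minimal sketch of a generic, unnormalized linear-attention recurrence. It is not the StateX expansion procedure, which this summary does not describe; the function name `linear_attention` and the toy inputs are illustrative assumptions. It only shows that the recurrent state has a fixed size regardless of sequence length, which is why per-token cost stays constant and why recall is bounded by state capacity.

```python
# Illustrative sketch only: a generic (unnormalized) linear-attention
# recurrence, NOT the StateX method. All names here are hypothetical.

def linear_attention(queries, keys, values):
    """Process a sequence token by token with a fixed-size state.

    The state is a d_k x d_v matrix updated as state += outer(k_t, v_t);
    the output for token t is q_t multiplied by the current state.
    """
    d_k, d_v = len(keys[0]), len(values[0])
    # The state size depends only on d_k and d_v, never on sequence length.
    state = [[0.0] * d_v for _ in range(d_k)]
    outputs = []
    for q, k, v in zip(queries, keys, values):
        # Write: add the outer product k v^T into the state.
        for i in range(d_k):
            for j in range(d_v):
                state[i][j] += k[i] * v[j]
        # Read: project the state with the query.
        outputs.append(
            [sum(q[i] * state[i][j] for i in range(d_k)) for j in range(d_v)]
        )
    return outputs, state


# Two orthogonal keys store two values without interfering with each other.
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[5.0, 6.0, 7.0], [8.0, 9.0, 10.0]]
queries = [[1.0, 0.0], [1.0, 0.0]]  # both steps query the first key
outputs, state = linear_attention(queries, keys, values)
```

In this toy setup the key dimension is 2, so at most two mutually orthogonal keys fit in the state; a third key would necessarily overlap with the stored ones and blur what is read back. A larger state relaxes that kind of limit, which is the general motivation for state expansion.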

Impact: Enhances recall for RNNs, potentially improving performance on tasks requiring long-context understanding.

Ranking rationale: This is a research paper introducing a new framework for improving RNN performance.

Read on arXiv cs.CL →

AI-generated summary · Google Gemini · From 1 source. How we write summaries →

Sources [1]

  1. arXiv cs.CL TIER_1 English (EN) · Xingyu Shen, Yingfa Chen, Zhen Leng Thai, Xu Han, Zhiyuan Liu, Maosong Sun

    StateX: Enhancing RNN Recall via Post-training State Expansion

    arXiv:2509.22630v3 Announce Type: replace Abstract: Recurrent neural networks (RNNs), such as linear attention and state-space models, have gained popularity due to their constant per-token complexity when processing long contexts. However, these recurrent models struggle with ta…