Associative-State Universal Transformers improve parameter efficiency with sparse retrieval

By PulseAugur Editorial Team · [1 source] · 2026-04-30 04:00

Researchers have developed UniMatrix, a Universal Transformer architecture that integrates structured recurrence with sparse retrieval mechanisms. Initial versions were parameter-efficient and competitive on standard language modeling benchmarks such as WikiText-2, but they struggled with associative recall. A subsequent iteration, UniMatrix-SparsePointer, significantly improved associative recall accuracy by incorporating sparse slot routing and pointer-logit fusion, achieving near-perfect performance on specific benchmarks with fewer parameters than traditional Transformers.

Impact: Introduces a parameter-efficient architecture that combines recurrence with sparse retrieval, potentially improving long-range dependency handling in language models.

Ranking rationale: This is a research paper detailing a new model architecture and its performance on various benchmarks.

AI-generated summary · Google Gemini · from 1 source.

Sources [1]

arXiv cs.CL TIER_1 English(EN) · Liu Xiao · 2026-04-30 04:00

Associative-State Universal Transformers: Sparse Retrieval Meets Structured Recurrence

arXiv:2604.25930v1 Announce Type: new Abstract: We study whether a structured recurrent state can serve as a compact associative backbone for language modeling while still supporting exact retrieval. We introduce UniMatrix, a Universal Transformer style family that reuses a share…
