On Subquadratic Architectures: From Applications to Principles
A new research paper compares three subquadratic architectures (xLSTM, Mamba-2, and Gated DeltaNet) on sequence modeling tasks. The study found that xLSTM outperformed the other two in code-model pre-training, distillation, and time-series foundation models. The researchers attribute this result to xLSTM's gating scheme, which allows flexible and stable memory correction and thereby supports robust state tracking and accumulation.
AI IMPACT: xLSTM's demonstrated advantage in state tracking and memory correction could influence future sequence model development, potentially leading to more efficient and capable AI systems.