Researchers have introduced Key-Value Means (KVM), a new attention mechanism for transformers that supports both fixed-size and growing states. With a fixed-size cache, KVM functions as an O(N) chunked RNN with minimal added parameters. A growable-cache variant is competitive on long-context tasks, offering subquadratic prefill time and sublinear state growth. The approach is compatible with standard attention operations, supports chunk-wise parallelizable training, and provides a flexible trade-off between prefill time complexity and memory usage.
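The summary does not describe the mechanism itself, but the name suggests a cache built from running means of key and value vectors. Below is a minimal NumPy sketch of what a fixed-size variant could look like, assuming each token is folded into the nearest of S mean slots and queries attend over the slot means; the function names, the nearest-slot assignment, and the count-based re-weighting are all illustrative assumptions, not details from the paper.

```python
import numpy as np

def kvm_update(means_k, means_v, counts, keys, values):
    """Fold one chunk of (key, value) pairs into S running-mean slots.
    Hypothetical update rule: assign each key to its nearest slot and
    update that slot's key/value means incrementally (constant memory)."""
    for k, v in zip(keys, values):
        # Nearest key-mean slot for this token (assumed assignment rule).
        s = int(np.argmin(np.linalg.norm(means_k - k, axis=1)))
        counts[s] += 1
        # Incremental running-mean update for both key and value.
        means_k[s] += (k - means_k[s]) / counts[s]
        means_v[s] += (v - means_v[s]) / counts[s]
    return means_k, means_v, counts

def kvm_attention(q, means_k, means_v, counts):
    """Attend over the S slot means instead of all N past tokens.
    log(count) re-weights each slot by how many tokens it absorbed
    (an assumption; empty slots get near-zero weight)."""
    logits = means_k @ q + np.log(np.maximum(counts, 1e-9))
    w = np.exp(logits - logits.max())
    w /= w.sum()
    return w @ means_v

# Usage: stream 1024 tokens through the fixed-size state in 16 chunks.
d, S = 64, 8
rng = np.random.default_rng(0)
means_k = rng.normal(size=(S, d)) * 0.01  # initial slot centroids
means_v = np.zeros((S, d))
counts = np.zeros(S)
for chunk in np.split(rng.normal(size=(1024, 2 * d)), 16):
    keys, values = chunk[:, :d], chunk[:, d:]
    kvm_update(means_k, means_v, counts, keys, values)
out = kvm_attention(rng.normal(size=d), means_k, means_v, counts)
```

Under these assumptions, processing N tokens costs O(N) time with O(S·d) state, consistent with the chunked-RNN framing; a growable variant would presumably add slots over time, trading sublinear state growth for subquadratic prefill.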
IMPACT: Introduces a novel attention mechanism that improves transformer efficiency for long-context tasks.