Researchers have introduced Key-Value Means (KVM), an attention mechanism for transformers that supports both fixed-size and growing states. With a fixed-size cache, KVM operates as an O(N) chunked RNN and adds few parameters. With a growable cache, it performs competitively on long-context tasks while offering subquadratic prefill time and sublinear state growth. The approach is compatible with standard operations, supports chunk-wise parallelizable training, and allows a flexible trade-off between prefill time complexity and memory usage.
Impact: Introduces an attention mechanism that improves transformer efficiency on long-context tasks.
Ranking rationale: Publication of an academic paper detailing a new model architecture.
AI-generated summary · Google Gemini · from 1 source.