PulseAugur

linear attention

PulseAugur coverage of linear attention — every cluster mentioning linear attention across labs, papers, and developer communities, ranked by signal.

Total · 30d: 12 (12 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 12 (12 over 90d)
SENTIMENT · 30D

5 days with sentiment data

RECENT · PAGE 1/1 · 12 TOTAL
  1. RESEARCH · CL_109619 ·

    Lifelong AI Learning Needs Parametric Attention in Transformers, Paper Argues

    A new research paper argues that lifelong continual learning in AI agents requires parametric forms of attention within transformer models. The paper argues that the current quadratic complexi…

  2. RESEARCH · CL_103889 ·

    HydraHead architecture fuses attention types for improved long-context LLMs

    Researchers have introduced HydraHead, a novel architecture that hybridizes Full Attention and Linear Attention at the head level within transformer models. This approach leverages interpretability to identify critical …

  3. RESEARCH · CL_93108 ·

    New research explores hybrid and sparse attention mechanisms for LLMs

    Researchers are exploring novel methods to optimize attention mechanisms in large language models, particularly for handling long contexts. The HydraHead architecture, for instance, hybridizes Full Attention (FA) and Li…

  4. RESEARCH · CL_84359 ·

    Bayesian theory explains emergent copy heads in transformer attention

    Researchers have developed a Bayesian theory to explain the emergence of "copy heads" in transformer attention mechanisms. Their analysis of a single-layer softmax attention network reveals a phase transition in how the…

  5. TOOL · CL_82518 ·

    Blurry Window Attention improves Transformer efficiency for long contexts

    Researchers have introduced Blurry Window Attention (BLA), a novel method designed to improve the efficiency of Transformer language models in handling long contexts. BLA addresses the quadratic complexity and growing K…

  6. RESEARCH · CL_77141 ·

    New model explains how training diversity boosts transformer in-context learning

    Researchers have developed an analytical model to explain how training task diversity influences in-context learning (ICL) in transformers. The model, which treats training task vectors as low-rank Gaussians, demonstrat…

  7. RESEARCH · CL_62204 ·

    New framework unifies sequence models using Bayesian memory

    Researchers have introduced a "design-model" framework for creating efficient recurrent sequence maps based on memory assumptions. This framework uses Bayesian filtering to write evidence into memory and a query-depende…

  8. RESEARCH · CL_43909 ·

    NVIDIA unveils Gated DeltaNet-2 for improved linear attention

    NVIDIA has introduced Gated DeltaNet-2, a new linear attention layer designed to improve memory editing in recurrent neural networks. This model separates the processes of erasing old information and writing new informa…

  9. TOOL · CL_30774 ·

    OSDN improves linear attention with online preconditioning

    Researchers have introduced OSDN, a novel method that enhances linear attention mechanisms by incorporating provable online preconditioning. This technique augments the Delta Rule with a diagonal preconditioner, which i…

  10. RESEARCH · CL_34499 ·

    New attention methods tackle LLM long-context challenges

    Researchers are developing new attention mechanisms to handle increasingly long contexts in large language models. One approach, Runtime-Certified Bounded-Error Quantized Attention, uses tiered KV caches to compress mem…

  11. TOOL · CL_25583 ·

    Recurrent models fail at state tracking due to error dynamics

    Researchers have introduced a new perspective on state tracking within recurrent neural network architectures, emphasizing error control dynamics over theoretical expressive capacity. They demonstrate that affine recurr…

  12. RESEARCH · CL_05127 ·

    StateX framework boosts RNN recall by expanding model states post-training

    Researchers have developed StateX, a post-training framework designed to improve the recall capabilities of recurrent neural networks (RNNs). This method efficiently expands the states of pre-trained RNNs, such as linea…
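
For readers new to the topic, the recurrent view of linear attention that several items above build on can be illustrated with a minimal sketch. It is not taken from any of the listed papers. It assumes an identity feature map, no normalization, and a plain matrix-valued state `S`; the function names (`additive_write`, `delta_write`, `read`) are hypothetical. The additive update accumulates key-value outer products, so the causal output can be computed with a fixed-size state instead of revisiting all earlier tokens. The delta-rule update instead corrects the value currently stored under a key, which is the general idea behind Delta Rule variants.

```python
# Minimal sketch of unnormalized causal linear attention (identity feature map).
# All names here are illustrative assumptions, not an API from any listed paper.

def zeros(d_k, d_v):
    return [[0.0] * d_v for _ in range(d_k)]

def read(S, q):
    # Read from the d_k x d_v state: returns S^T q (a vector of length d_v).
    return [sum(S[i][j] * q[i] for i in range(len(q))) for j in range(len(S[0]))]

def additive_write(S, k, v):
    # Plain linear attention update: S <- S + k v^T.
    return [[S[i][j] + k[i] * v[j] for j in range(len(v))] for i in range(len(k))]

def delta_write(S, k, v, beta=1.0):
    # Delta-rule update: S <- S + beta * k (v - S^T k)^T.
    # The state is moved toward returning v when queried with k.
    pred = read(S, k)
    err = [vj - pj for vj, pj in zip(v, pred)]
    return [[S[i][j] + beta * k[i] * err[j] for j in range(len(v))] for i in range(len(k))]

def linear_attention_recurrent(qs, ks, vs):
    # One state update and one read per token; state size is independent of length.
    S = zeros(len(ks[0]), len(vs[0]))
    outs = []
    for q, k, v in zip(qs, ks, vs):
        S = additive_write(S, k, v)
        outs.append(read(S, q))
    return outs

def linear_attention_quadratic(qs, ks, vs):
    # Same quantity computed token-by-token over the full prefix:
    # o_t = sum over s <= t of (q_t . k_s) * v_s.
    outs = []
    for t, q in enumerate(qs):
        out = [0.0] * len(vs[0])
        for s in range(t + 1):
            w = sum(a * b for a, b in zip(q, ks[s]))
            out = [o + w * x for o, x in zip(out, vs[s])]
        outs.append(out)
    return outs

qs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
ks = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
vs = [[3.0, 4.0], [1.0, 2.0], [5.0, 6.0]]
recurrent_out = linear_attention_recurrent(qs, ks, vs)
quadratic_out = linear_attention_quadratic(qs, ks, vs)
```

In this toy setting the two functions return the same outputs, which is the sense in which linear attention trades the growing prefix scan for a constant-size state. Writing the same unit-norm key twice with `delta_write` and `beta=1.0` overwrites the stored value, whereas `additive_write` sums the two values.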