Gated DeltaNet-2: Decoupling Erase and Write in Linear Attention

NVIDIA releases Gated DeltaNet-2 to improve linear attention

By the PulseAugur editorial team · [4 sources] · 2026-05-21 17:44

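NVIDIA has introduced Gated DeltaNet-2, a new linear attention layer designed to improve how recurrent models edit their memory. It uses separate channel-wise gates to decouple erasing old information from writing new information, addressing a limitation of earlier delta-rule architectures. A 1.3B-parameter model trained on 100B tokens outperforms existing models such as Mamba-2 and KDA on long-context retrieval tasks.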
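To make the erase/write split concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's formulation: the sources here do not give the exact update rule, so the per-channel `erase` and `write` vectors (applied over value channels) and the function names are hypothetical. The first function is the standard delta rule with a single scalar gate `beta`; the second replaces that one gate with two independent vectors, and reduces to the first when both vectors equal `beta` in every channel.

```python
def matvec(S, k):
    """Read from memory: returns S @ k for a d_v x d_k state S."""
    return [sum(s_ij * k_j for s_ij, k_j in zip(row, k)) for row in S]


def coupled_delta_step(S, k, v, beta):
    """Delta rule with one scalar gate: beta scales both erase and write."""
    old = matvec(S, k)  # value currently stored under key k
    return [
        [s_ij + beta * (v_i - old_i) * k_j for s_ij, k_j in zip(row, k)]
        for row, v_i, old_i in zip(S, v, old)
    ]


def decoupled_delta_step(S, k, v, erase, write):
    """Hypothetical variant: separate per-channel erase and write gates.

    `erase` and `write` are lists of length d_v (an assumption of this sketch).
    """
    old = matvec(S, k)
    return [
        [s_ij - e_i * old_i * k_j + w_i * v_i * k_j for s_ij, k_j in zip(row, k)]
        for row, v_i, old_i, e_i, w_i in zip(S, v, old, erase, write)
    ]


# Fixed-size state: 2 value channels x 2 key channels, initially empty.
S0 = [[0.0, 0.0], [0.0, 0.0]]
k = [1.0, 0.0]  # unit-norm key

# Store v = [1, 2] under k with a full-strength coupled update.
S1 = coupled_delta_step(S0, k, [1.0, 2.0], beta=1.0)
# S1 == [[1.0, 0.0], [2.0, 0.0]]

# Overwrite only channel 0; channel 1 is neither erased nor written.
S2 = decoupled_delta_step(S1, k, [5.0, 7.0], erase=[1.0, 0.0], write=[1.0, 0.0])
# S2 == [[5.0, 0.0], [2.0, 0.0]]
```

With a single scalar gate, the second update could only replace both channels together or blend both by the same amount; separate gates let one channel be rewritten while the other keeps its stored value.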

Impact: Strengthens long-context handling in recurrent models, which may improve performance on complex language tasks.

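Ranking rationale: This cluster describes a new model architecture and its benchmark performance. The model is detailed in an arXiv paper and has been covered by tech news outlets.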

AI-generated summary · Google Gemini · from 4 sources.

Sources [4]

arXiv cs.AI TIER_1 English(EN) · Jan Kautz · 2026-05-21 17:44

Gated DeltaNet-2: Decoupling Erase and Write in Linear Attention

Linear attention replaces the unbounded cache of softmax attention with a fixed-size recurrent state, reducing sequence mixing to linear time and decoding to constant memory. The hard part is not just what to forget, but how to edit this compressed memory without scrambling exist…
MarkTechPost TIER_1 English(EN) · Asif Razzaq · 2026-05-24 07:42

NVIDIA AI Releases Gated DeltaNet-2: A Linear Attention Layer That Decouples Erase and Write in the Delta Rule

Linear attention squeezes the unbounded KV cache into a fixed-size recurrent state, but editing that memory without scrambling existing associations is hard. Prior delta-rule models like Gated DeltaNet and KDA use one scalar gate to control both erasing old content and writing…
Mastodon — fosstodon.org TIER_1 Polski(PL) · [email protected] · 2026-05-25 11:29

NVIDIA releases Gated DeltaNet-2, a new linear attention architecture that sharply improves precision through independent write and erase gates

NVIDIA has presented Gated DeltaNet-2, a new linear attention architecture that, thanks to independent gates for writing and erasing data, drastically improves the precision of AI models in long contexts. #si #ai #sztucznainteligencja #wiadomości #informacje #technologia https:// a…

Links aisight.pl/…/nvidia-gated-deltanet-2 aisight.pl/…/generatory-obrazow-ai-stereo…
Mastodon — fosstodon.org TIER_1 English(EN) · [email protected] · 2026-05-24 07:52

NVIDIA releases Gated DeltaNet-2, a linear attention layer that decouples erasing old content from writing new content via separate channel-wise gates. At 1…

NVIDIA has released Gated DeltaNet-2, a linear attention layer that decouples erasing old content from writing new content via separate channel-wise gates. At 1.3B parameters trained on 100B tokens, it outperforms Mamba-2, Gated DeltaNet and KDA on language modelling and long-con…

Links marktechpost.com/…/nvidia-ai-releases-gat…
