Dynamic Short Convolutions Improve Transformers

Dynamic Convolutions Boost Transformer Performance in LLMs

By the PulseAugur editorial team · [2 sources] · 2026-06-02 16:07

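Researchers have introduced dynamic short convolutions as a new primitive for strengthening the Transformer architecture used in large language models. These convolutions use input-dependent filters, which increases expressivity while retaining the locality bias of conventional convolutions. Experiments show consistent gains over both standard Transformers and static-convolution variants across a range of parameter scales, which suggests potential for computational advantages and for advancing Transformer-based language models.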
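To make the contrast between static and input-dependent filters concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the summary does not say how the filters are generated, so the linear projection `proj` that maps each token vector to its filter taps is an assumption, as are the function names `static_short_conv` and `dynamic_short_conv`.

```python
from typing import List

Vector = List[float]


def _apply_taps(x: List[Vector], t: int, taps: List[float]) -> Vector:
    """Weighted sum of the current token and its predecessors (causal window)."""
    y = [0.0] * len(x[t])
    for k, w in enumerate(taps):
        if t - k < 0:  # nothing before the start of the sequence
            break
        for d, value in enumerate(x[t - k]):
            y[d] += w * value
    return y


def static_short_conv(x: List[Vector], taps: List[float]) -> List[Vector]:
    """Static variant: one fixed short filter is shared by every position."""
    return [_apply_taps(x, t, taps) for t in range(len(x))]


def dynamic_short_conv(x: List[Vector], proj: List[Vector]) -> List[Vector]:
    """Dynamic variant: the filter at position t is computed from x[t].

    `proj` is a hypothetical K x D matrix (K taps, D features). Using a
    plain linear map here is an illustrative assumption, not the paper's method.
    """
    out = []
    for t, x_t in enumerate(x):
        taps_t = [sum(p * v for p, v in zip(row, x_t)) for row in proj]
        out.append(_apply_taps(x, t, taps_t))
    return out


if __name__ == "__main__":
    tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(static_short_conv(tokens, [0.5, 0.5]))
    print(dynamic_short_conv(tokens, [[1.0, 0.0], [0.0, 1.0]]))
```

Both functions look only at a short window of preceding tokens, which is the locality bias the summary refers to. The only difference is that the dynamic version recomputes its filter taps at every position from the current token.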

Impact: Introduces a novel technique that offers computational advantages and performance gains for Transformer-based language models.

Ranking rationale: This cluster contains an academic paper detailing a new technique for improving Transformer models.


AI-generated summary · Google Gemini · based on 2 sources.

Coverage sources [2]

arXiv cs.CL TIER_1 English(EN) · Oliver Sieberling, Bharat Runwal, Rameswar Panda, Yoon Kim · 2026-06-03 04:00

Dynamic Short Convolutions Improve Transformers

arXiv:2606.03825v1 Announce Type: cross Abstract: Transformers have become the dominant architecture for large language models, largely due to the scalability and flexibility of attention, feed-forward layers, residual connections, and normalization. This paper introduces dynamic…
arXiv cs.CL TIER_1 English(EN) · Yoon Kim · 2026-06-02 16:07

Dynamic Short Convolutions Improve Transformers

Transformers have become the dominant architecture for large language models, largely due to the scalability and flexibility of attention, feed-forward layers, residual connections, and normalization. This paper introduces dynamic short convolutions as an additional neural networ…
