Adaptive Head Budgeting for Efficient Multi-Head Attention

BudgetFormer cuts Transformer cost through adaptive attention-head allocation

By the PulseAugur editorial team · [2 sources] · 2026-04-24 14:15

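Researchers have developed BudgetFormer, a Transformer architecture that optimizes the use of multi-head attention by allocating compute dynamically. The mechanism learns to select the most informative attention heads for each input, which cuts unnecessary computation and may improve performance. Experiments on text classification tasks indicate that BudgetFormer can reduce FLOPs and memory usage while matching or exceeding the effectiveness of standard full multi-head attention.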
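To make the idea of per-input head selection concrete, here is a minimal NumPy sketch. It is not the paper's method, whose details are not given in this summary. It assumes a hypothetical linear gate (`w_gate`) that scores each head from a mean-pooled view of the input, and a fixed `budget` of heads kept by hard top-k selection. All names, shapes, and the gating rule are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def budgeted_attention(x, wq, wk, wv, w_gate, budget):
    """Illustrative head-budgeted self-attention (assumed design, not the paper's).

    x: (seq, d_model); wq/wk/wv: (heads, d_model, d_head); w_gate: (d_model, heads).
    Returns the concatenated head outputs and the indices of the active heads.
    """
    n_heads, _, d_head = wq.shape
    # Hypothetical gate: score every head from the mean-pooled input.
    scores = x.mean(axis=0) @ w_gate                  # shape (heads,)
    # Keep only the `budget` highest-scoring heads for this input.
    active = np.sort(np.argsort(scores)[-budget:])
    out = np.zeros((x.shape[0], n_heads, d_head))
    for h in active:
        # Projections and attention are computed for active heads only;
        # skipped heads contribute zeros and incur no attention cost.
        q, k, v = x @ wq[h], x @ wk[h], x @ wv[h]
        attn = softmax(q @ k.T / np.sqrt(d_head))
        out[:, h] = attn @ v
    return out.reshape(x.shape[0], n_heads * d_head), active

rng = np.random.default_rng(0)
seq, d_model, n_heads, d_head = 5, 16, 4, 4
x = rng.normal(size=(seq, d_model))
wq, wk, wv = (rng.normal(size=(n_heads, d_model, d_head)) for _ in range(3))
w_gate = rng.normal(size=(d_model, n_heads))
y, active = budgeted_attention(x, wq, wk, wv, w_gate, budget=2)
```

Here two of the four heads are evaluated, so the attention work for this input is roughly halved. Hard top-k selection is not differentiable, so a trainable version would need some relaxation or estimator for the gate. The sketch does not attempt that and covers only the inference-time selection step.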

Impact: Introduces a method for lowering the computational cost of Transformer inference without sacrificing performance.

Ranking rationale: An academic paper introducing a new architectural modification to Transformer models.


AI-generated summary · Google Gemini · based on 2 sources.

Sources [2]

arXiv cs.LG TIER_1 English(EN) · Bilal Faye, Abdoulaye Mbaye, Hanane Azzag, Mustapha Lebbah · 2026-04-27 04:00

Adaptive Head Budgeting for Efficient Multi-Head Attention

arXiv:2604.22583v1 · Abstract: Transformers have become the dominant architecture across a wide range of domains, largely due to the effectiveness of multi-head attention in capturing diverse representation subspaces. However, standard multi-head attention activa…
arXiv cs.LG TIER_1 English(EN) · Mustapha Lebbah · 2026-04-24 14:15

Adaptive Head Budgeting for Efficient Multi-Head Attention

Transformers have become the dominant architecture across a wide range of domains, largely due to the effectiveness of multi-head attention in capturing diverse representation subspaces. However, standard multi-head attention activates all heads uniformly for every input, regardl…
