New ALVTS method improves LVLM efficiency through adaptive token selection

By PulseAugur Editorial Team · [2 sources] · 2026-06-12 08:58

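Researchers have introduced Adaptive Layer-wise Visual Token Selection (ALVTS), a framework designed to improve the efficiency of Large Vision-Language Models (LVLMs). Unlike earlier methods that permanently discard tokens, ALVTS dynamically selects the important tokens for further processing and lets less critical tokens skip certain layers. This adaptive approach reduces computational redundancy without retraining the model. According to the reported experiments, ALVTS reaches an 89% token compression rate on models including LLaVA-1.5, LLaVA-NeXT, and Qwen2.5-VL while retaining 96.7% of the original model's accuracy.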
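The difference between dropping tokens and letting them skip layers can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: `toy_layer` stands in for a transformer layer, `toy_scores` is a hypothetical importance score, and `keep_ratio` is an invented parameter. None of these are taken from the paper, whose actual selection criterion is not described in the summary above.

```python
import random


def select_top_k(scores, k):
    """Return the indices of the k highest-scoring tokens, in ascending order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def toy_layer(vec, layer_idx):
    """Stand-in for a transformer layer (assumption): a simple per-layer shift."""
    return [v + 0.1 * (layer_idx + 1) for v in vec]


def toy_scores(tokens, layer_idx):
    """Hypothetical importance score (assumption, not the paper's criterion).

    A different coordinate is inspected at each layer so that the selected
    subset can change from one layer to the next.
    """
    dim = layer_idx % len(tokens[0])
    return [abs(t[dim]) for t in tokens]


def run_with_layerwise_selection(tokens, num_layers, keep_ratio):
    """Process only the selected tokens at each layer; the rest skip the layer.

    Skipped tokens stay in the sequence unchanged, so a later layer may still
    select them. Nothing is permanently discarded.
    """
    tokens = [list(t) for t in tokens]  # work on a copy
    k = max(1, round(keep_ratio * len(tokens)))
    processed_counts = [0] * len(tokens)
    for layer_idx in range(num_layers):
        selected = select_top_k(toy_scores(tokens, layer_idx), k)
        for i in selected:
            tokens[i] = toy_layer(tokens[i], layer_idx)
            processed_counts[i] += 1
    return tokens, processed_counts


# Small demo: 16 toy "visual tokens" of dimension 4, 4 layers, keep 25% per layer.
random.seed(0)
demo_tokens = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(16)]
demo_out, demo_counts = run_with_layerwise_selection(
    demo_tokens, num_layers=4, keep_ratio=0.25
)
print("tokens kept in sequence:", len(demo_out))
print("layers applied per token:", demo_counts)
```

In this sketch the sequence length never shrinks, and the per-layer work is proportional to `keep_ratio`. A permanent-pruning scheme would instead remove the unselected tokens after the first selection, so later layers could never revisit them.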

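Impact: The method offers a way to substantially reduce the computational load of LVLMs, which could support broader deployment and faster inference.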

Ranking rationale: This cluster contains a paper detailing a new method for improving LVLM efficiency.

AI-generated summary · Google Gemini · based on 2 sources.

Sources [2]

arXiv cs.CV TIER_1 English(EN) · Yongru Chen, Kai Zhang, Zeliang Zong, Yuchen Lu, Wenming Tan, Ye Ren, Jilin Hu · 2026-06-15 04:00

One Layer's Trash is Another Layer's Treasure: Adaptive Layer-wise Visual Token Selection in LVLMs

arXiv:2606.14277v1 Announce Type: new Abstract: Large Vision-Language Models (LVLMs) have achieved remarkable success across diverse multimodal tasks, yet their practical deployment remains constrained by the computational burden arising from lengthy visual tokens. While visual t…

arXiv cs.CV TIER_1 English(EN) · Jilin Hu · 2026-06-12 08:58

One Layer's Trash is Another Layer's Treasure: Adaptive Layer-wise Visual Token Selection in LVLMs

Large Vision-Language Models (LVLMs) have achieved remarkable success across diverse multimodal tasks, yet their practical deployment remains constrained by the computational burden arising from lengthy visual tokens. While visual token pruning has emerged as a promising solution…
