UtVAA: Ultra-tiny Vision Transformer with Affix Attention for Mobile Image Classification

An ultra-tiny vision Transformer designed for mobile deployment

By the PulseAugur editorial team · [1 source] · 2026-06-16 04:00

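Researchers have developed UtVAA, an ultra-tiny vision Transformer architecture optimized for mobile and edge devices. The model uses Affix Attention, which combines local feature extraction with linear self-attention and with coordinate attention for spatial modeling. UtVAA also uses Dilated Bottleneck blocks to expand the receptive field efficiently. The smallest variant has more than 200,000 parameters and 53 million FLOPs, and achieves competitive accuracy on benchmark datasets such as CIFAR-10 and CIFAR-100, showing that Transformer-based vision models can be shrunk substantially without a large loss in performance.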
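To illustrate why dilation expands the receptive field efficiently, here is a minimal sketch. It assumes stacked stride-1 square convolutions; the kernel size and dilation rates are hypothetical illustration values, not the configuration of UtVAA's Dilated Bottleneck block, which the summary does not specify.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field along one axis of stacked stride-1 dilated convolutions.

    Each layer with dilation d widens the field by (kernel_size - 1) * d.
    """
    field = 1
    for dilation in dilations:
        field += (kernel_size - 1) * dilation
    return field


def conv_weights(kernel_size, channels_in, channels_out):
    """Weight count of one 2D convolution; dilation does not change it."""
    return kernel_size * kernel_size * channels_in * channels_out


# Hypothetical three-layer stacks of 3x3 convolutions (not from the paper).
plain = receptive_field(3, [1, 1, 1])
dilated = receptive_field(3, [1, 2, 4])

print(plain)    # 7
print(dilated)  # 15
print(conv_weights(3, 16, 16))  # 2304 weights per layer in both stacks
```

Both stacks carry the same number of weights, yet the dilated one covers roughly twice the spatial extent, which is the general appeal of dilation in very small models.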

Ranking rationale: an academic paper detailing a new computer vision model architecture.


AI-generated summary · Google Gemini · based on 1 source.

Sources [1]

arXiv cs.CV · Romiyal George, Sathiyamohan Nishankar, Selvarajah Thuseethan, Roshan G. Ragel · 2026-06-16 04:00

UtVAA: Ultra-tiny Vision Transformer with Affix Attention for Mobile Image Classification

arXiv:2606.14735v1 · Abstract: Vision Transformers (ViTs) have demonstrated strong representation capability in image classification. However, their quadratic self-attention complexity and large parameter counts limit deployment on resource-constrained mobile and…
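The quadratic cost mentioned in the abstract can be made concrete with a rough multiply-accumulate count. The sketch below compares standard softmax attention with one common kernelized form of linear attention, in which K^T V is computed before multiplying by Q. The token count and head dimension are hypothetical, and the exact linear attention used in UtVAA is not described in the excerpt.

```python
def softmax_attention_macs(num_tokens, head_dim):
    """MACs for Q @ K^T (N x d by d x N) plus weights @ V (N x N by N x d)."""
    return 2 * num_tokens * num_tokens * head_dim


def linear_attention_macs(num_tokens, head_dim):
    """MACs for K^T @ V (d x N by N x d) plus Q @ (K^T V) (N x d by d x d)."""
    return 2 * num_tokens * head_dim * head_dim


# Hypothetical setting: one token per pixel of a 32x32 image, head dim 32.
num_tokens = 32 * 32
head_dim = 32

quadratic = softmax_attention_macs(num_tokens, head_dim)
linear = linear_attention_macs(num_tokens, head_dim)

print(quadratic)            # 67108864
print(linear)               # 2097152
print(quadratic // linear)  # 32, i.e. num_tokens / head_dim
```

Under these assumptions the saving factor equals the ratio of token count to head dimension, so the benefit grows with input resolution and shrinks when the sequence is shorter than the head dimension.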
