SnapViT enables elastic Vision Transformers without retraining

By PulseAugur Editorial · [1 source] · 2026-06-01 04:00

Researchers have developed SnapViT, a method for creating elastic Vision Transformers (ViTs) that adapt to different computational budgets without retraining. This post-pretraining structured pruning technique combines gradient information with cross-network structure correlations, which are approximated via an evolutionary algorithm. In experiments on several pretrained models, SnapViT outperforms existing methods across a range of sparsities and generates adjustable models in under five minutes on a single A100 GPU.
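
To make the idea of an "elastic" pruned model concrete, here is a minimal sketch of the general pattern: score structural units once, then slice the same ranking at whatever sparsity the deployment budget allows. It is not SnapViT itself. The importance score is a plain first-order |weight × gradient| sum, swapped in for illustration, and the cross-network correlation term and the evolutionary algorithm mentioned above are left out entirely. All function names and the toy numbers are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def unit_importance(weights: Matrix, grads: Matrix) -> List[float]:
    """First-order importance per hidden unit: sum of |w * g| over its row.

    Assumption: a generic gradient-based score, not the paper's criterion.
    """
    return [
        sum(abs(w * g) for w, g in zip(w_row, g_row))
        for w_row, g_row in zip(weights, grads)
    ]


def rank_units(scores: List[float]) -> List[int]:
    """Unit indices sorted from most to least important (computed once)."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def prune_to_sparsity(weights: Matrix, ranking: List[int], sparsity: float) -> Matrix:
    """Keep the top (1 - sparsity) fraction of units, preserving row order."""
    if not 0.0 <= sparsity < 1.0:
        raise ValueError("sparsity must be in [0, 1)")
    keep = max(1, round(len(ranking) * (1.0 - sparsity)))
    kept = sorted(ranking[:keep])
    return [weights[i] for i in kept]


# Toy layer with 4 hidden units; each row holds one unit's incoming weights.
W = [[0.5, -0.2], [0.1, 0.1], [-0.9, 0.4], [0.05, 0.0]]
G = [[0.1, 0.3], [0.2, 0.1], [0.5, 0.5], [0.9, 0.1]]

ranking = rank_units(unit_importance(W, G))    # one scoring pass
half = prune_to_sparsity(W, ranking, 0.5)      # 2 of 4 units survive
quarter = prune_to_sparsity(W, ranking, 0.25)  # 3 of 4 units survive
```

Because the ranking is computed a single time, producing a model for a new budget is only a slicing step, which is the property that lets one pruning pass serve many sparsities.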

IMPACT Enables more flexible deployment of vision models across diverse hardware constraints.

RANK_REASON The cluster contains an academic paper detailing a new method for adapting existing models.

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

arXiv cs.CV TIER_1 English(EN) · Walter Simoncini, Michael Dorkenwald, Tijmen Blankevoort, Cees G. M. Snoek, Yuki M. Asano · 2026-06-01 04:00

Elastic ViTs from Pretrained Models without Retraining

arXiv:2510.17700v2. Abstract: Vision foundation models achieve remarkable performance but are only available in a limited set of pre-determined sizes, forcing sub-optimal deployment choices under real-world constraints. We introduce SnapViT: Single-shot netw…
