Researchers have developed CLEAR-MoE, a post-training method that converts frozen Vision Transformers (ViTs) into sparse Mixture-of-Experts (MoE) models without altering the original backbone weights. The method is a four-phase pipeline that scores and decomposes feed-forward network (FFN) layers, trains lightweight routers, and dispatches tokens. In experiments on several ViT backbones, CLEAR-MoE retained nearly all of the dense model's accuracy, and the shared singular value decomposition (SVD) basis proved crucial for preserving performance. Routing and its associated overhead slightly slow FFN execution, but the approach shows promise for efficient MoE model creation.
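The general idea of carving a frozen FFN into experts can be illustrated with a minimal NumPy sketch. It is not the CLEAR-MoE pipeline: the neuron grouping here is a random partition rather than the paper's scoring and decomposition, the router is an untrained random matrix, there is no SVD basis, and all function names (`split_into_experts`, `moe_ffn`, `demo`) are hypothetical. It only shows that slicing the hidden neurons of a frozen FFN into groups leaves the weights untouched, and that activating every group reproduces the dense output.

```python
import numpy as np

def gelu(x):
    # tanh approximation of GELU, applied elementwise
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def dense_ffn(x, w1, b1, w2, b2):
    # Standard transformer FFN: Linear -> GELU -> Linear
    return gelu(x @ w1 + b1) @ w2 + b2

def split_into_experts(w1, b1, w2, groups):
    # Each expert is a slice of the frozen weights over one group of hidden neurons.
    return [(w1[:, idx], b1[idx], w2[idx, :]) for idx in groups]

def moe_ffn(x, experts, b2, router_w, top_k):
    # Assumption: a plain linear router with unweighted top-k selection.
    logits = x @ router_w                              # (tokens, n_experts)
    chosen = np.argsort(-logits, axis=1)[:, :top_k]    # top-k expert ids per token
    out = np.tile(b2, (x.shape[0], 1))                 # shared output bias
    for e, (w1_e, b1_e, w2_e) in enumerate(experts):
        mask = (chosen == e).any(axis=1)               # tokens dispatched to expert e
        if mask.any():
            out[mask] += gelu(x[mask] @ w1_e + b1_e) @ w2_e
    return out

def demo(seed=0):
    rng = np.random.default_rng(seed)
    d_model, d_hidden, n_experts, n_tokens = 8, 32, 4, 5
    w1 = rng.normal(size=(d_model, d_hidden))
    b1 = rng.normal(size=d_hidden)
    w2 = rng.normal(size=(d_hidden, d_model))
    b2 = rng.normal(size=d_model)
    # Placeholder grouping: a random partition of the hidden neurons.
    groups = np.array_split(rng.permutation(d_hidden), n_experts)
    experts = split_into_experts(w1, b1, w2, groups)
    router_w = rng.normal(size=(d_model, n_experts))   # untrained, illustrative router
    x = rng.normal(size=(n_tokens, d_model))
    dense = dense_ffn(x, w1, b1, w2, b2)
    full = moe_ffn(x, experts, b2, router_w, top_k=n_experts)
    sparse = moe_ffn(x, experts, b2, router_w, top_k=1)
    return dense, full, sparse
```

Calling `demo()` returns the dense output, the output with all experts active (which matches the dense one up to floating-point error), and a top-1 sparse output that evaluates only a quarter of the hidden neurons per token. How close the sparse output stays to the dense one depends on the grouping and the router, which is the part the summarized method addresses.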
IMPACT Enables efficient creation of sparse Mixture-of-Experts models from existing Vision Transformers without retraining the backbone.
RANK_REASON The cluster contains a research paper detailing a new method for converting existing models.
- CLEAR-MoE
- DeiT-Base
- DeiT-Small
- DeiT-Tiny
- Imagenette
- k-means clustering
- Md. Irtiza Hossain
- singular value decomposition
- Vision Transformer
- ViT-Small
AI-generated summary · Google Gemini · from 1 source.