Revisiting Model Stitching In the Foundation Model Era
Researchers have revisited model stitching, a technique that connects the early layers of one AI model to the later layers of another through a 'stitch' layer, to test whether it applies to Vision Foundation Models (VFMs). Their study found that training the stitch layer is crucial for maintaining accuracy, especially at shallower connection points. By training it with a feature-matching loss at the target model's penultimate layer, they demonstrated that heterogeneous VFMs can be reliably stitched together for various vision tasks, with the stitched model sometimes surpassing the individual models.
AI IMPACT: This research offers a way to combine the complementary strengths of different Vision Foundation Models, potentially improving performance and providing a controllable accuracy-latency trade-off for multimodal applications.
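
To make the stitching setup in the summary concrete, here is a minimal NumPy sketch, not the authors' implementation. It rests on several assumptions: two tiny frozen random networks stand in for the VFMs, the stitch is a single linear map, and the loss is a squared error between the stitched model's penultimate features and the target model's own. All names (`front_a`, `back_b`, `train_stitch`, the dimensions, the learning rate) are hypothetical and chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for two frozen models (hypothetical, not real VFMs).
D_IN, D_A, D_B, D_Z = 8, 6, 5, 4
W_a = rng.normal(size=(D_IN, D_A)) / np.sqrt(D_IN)    # early layers of source model A
W_b = rng.normal(size=(D_IN, D_B)) / np.sqrt(D_IN)    # early layers of target model B
W_back = rng.normal(size=(D_B, D_Z)) / np.sqrt(D_B)   # later layers of B, up to its penultimate features


def front_a(x):
    """Early layers of the source model (frozen)."""
    return np.tanh(x @ W_a)


def front_b(x):
    """Early layers of the target model (frozen)."""
    return np.tanh(x @ W_b)


def back_b(h):
    """Later layers of the target model, ending at its penultimate features (frozen)."""
    return np.tanh(h @ W_back)


def stitched_features(x, S):
    """Stitched model: front of A -> linear stitch S -> back of B."""
    return back_b(front_a(x) @ S)


def feature_matching_loss(x, S):
    """Mean squared distance between stitched and target penultimate features."""
    diff = stitched_features(x, S) - back_b(front_b(x))
    return np.mean(np.sum(diff ** 2, axis=1))


def train_stitch(x, steps=1000, lr=0.01):
    """Fit only the stitch layer by gradient descent; both models stay frozen."""
    S = np.zeros((D_A, D_B))
    h_a = front_a(x)
    z_target = back_b(front_b(x))
    for _ in range(steps):
        z_hat = np.tanh(h_a @ S @ W_back)
        # Gradient of the loss w.r.t. the pre-activation of the final tanh.
        grad_pre = 2.0 * (z_hat - z_target) * (1.0 - z_hat ** 2) / len(x)
        # Backpropagate through the frozen back half into the stitch weights.
        grad_S = h_a.T @ (grad_pre @ W_back.T)
        S -= lr * grad_S
    return S


x = rng.normal(size=(256, D_IN))
initial_loss = feature_matching_loss(x, np.zeros((D_A, D_B)))
S = train_stitch(x)
final_loss = feature_matching_loss(x, S)
print(f"feature-matching loss: {initial_loss:.4f} -> {final_loss:.4f}")
```

Only `S` is updated in this sketch, which mirrors the summary's point that the stitch layer is the trained component. Moving the split point between `front_a` and `back_b` deeper or shallower is one way the accuracy-latency trade-off could be explored.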