Mind the Gap: Disentangling Performance Bottlenecks in Video Instance Segmentation
Researchers have developed a new diagnostic framework to analyze performance bottlenecks in video instance segmentation (VIS). This framework uses an Integer Linear Program (ILP) to isolate error sources from classification, segmentation, and tracking objectives. The analysis revealed that tracking instability is a major issue for online VIS methods, especially in longer videos or denser scenes, and that stronger backbones do not significantly improve tracking performance.
AI IMPACT: Provides a systematic foundation for improving robust long-term temporal association in video instance segmentation.