SetCon: Towards Open-Ended Referring Segmentation via Set-Level Concept Prediction
Researchers have introduced SetCon, an approach to open-ended referring segmentation that treats multiple targets as a coherent set rather than as individual outputs. The method reformulates the problem as explicit set-level concept prediction, using natural-language concepts generated by Large Vision Language Models (LVLMs). SetCon first predicts a broad set-level concept and then refines it into finer-grained groups. It achieves state-of-the-art results on image and video benchmarks, particularly as the number of referred targets increases.
AI IMPACT: Improves segmentation accuracy in complex, multi-target scenarios, potentially enhancing AI's ability to understand and interact with visual scenes.