COCOTree: A Dataset and Benchmark for Open Tree-Structured Visual Decomposition
Researchers have introduced COCOTree, a dataset and benchmark for open tree-structured visual decomposition, a task in which an image is segmented into a hierarchical tree of visual components at flexible granularity. The dataset was generated with a pipeline that combines Large Vision-Language Models with SAM 3 to provide semantic reasoning and geometric grounding. It contains over 2.1K images and 1.8M structural nodes, with an open vocabulary of 3.5K labels. The researchers also propose an evaluation metric, Open Tree Quality (OTQ), which assesses mask precision, label accuracy, and structural consistency.
AI IMPACT: Enables new research in hierarchical image segmentation and visual decomposition tasks.
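
To make "hierarchical tree of visual components" concrete, here is a minimal Python sketch of one possible in-memory representation. It is an illustration only: the `Node` class, the helper names, and the pixel-set mask encoding are assumptions made for this example, not COCOTree's actual annotation format. The containment check is one simple reading of "structural consistency" and is not the OTQ metric, whose definition is not given here.

```python
from dataclasses import dataclass, field

# Hypothetical encoding: a mask is the set of (row, col) pixels it covers.
Pixel = tuple[int, int]


@dataclass
class Node:
    """One visual component: an open-vocabulary label, a mask, and sub-parts."""
    label: str
    mask: frozenset[Pixel]
    children: list["Node"] = field(default_factory=list)


def count_nodes(node: Node) -> int:
    """Total number of nodes in the subtree rooted at `node`."""
    return 1 + sum(count_nodes(child) for child in node.children)


def depth(node: Node) -> int:
    """Number of levels in the subtree; a leaf has depth 1."""
    return 1 + max((depth(child) for child in node.children), default=0)


def is_nested(node: Node) -> bool:
    """True if every child's mask lies inside its parent's mask, recursively."""
    return all(
        child.mask <= node.mask and is_nested(child)
        for child in node.children
    )


def rect(r0: int, c0: int, r1: int, c1: int) -> frozenset[Pixel]:
    """Rectangular mask covering rows r0..r1-1 and columns c0..c1-1."""
    return frozenset((r, c) for r in range(r0, r1) for c in range(c0, c1))


# A toy tree: person -> {head -> eye, torso}
tree = Node("person", rect(0, 0, 8, 4), [
    Node("head", rect(0, 1, 2, 3), [Node("eye", rect(0, 1, 1, 2))]),
    Node("torso", rect(2, 0, 5, 4)),
])

# A malformed tree whose child mask spills outside its parent.
bad_tree = Node("cup", rect(0, 0, 2, 2), [Node("handle", rect(0, 1, 1, 4))])

print(count_nodes(tree), depth(tree), is_nested(tree), is_nested(bad_tree))
```

Flexible granularity corresponds to how deep a branch goes: in this toy tree the "head" branch is decomposed one level further than the "torso" branch.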