GH-ESD: Grounded Hypothesis-Driven Error Slice Discovery for Instance-Level Vision Tasks
Researchers have developed GH-ESD, a framework for discovering systematic failures in instance-level vision tasks such as object detection and segmentation. Unlike previous methods, which focused on image-level classification, GH-ESD uses Large Language Models (LLMs) and Vision-Language Models (VLMs) to generate and verify hypotheses about the relational and spatially grounded visual patterns that cause errors. The work also introduces GESD, a benchmark dataset for instance-level error slice discovery, on which GH-ESD reportedly outperforms existing methods.
AI IMPACT: Introduces a method for identifying and understanding failures in vision models, potentially leading to more robust AI systems.