Visual Genome
PulseAugur coverage of Visual Genome — every cluster mentioning Visual Genome across labs, papers, and developer communities, ranked by signal.
-
VLMs enable open-vocabulary video scene graph generation
A new method for video scene graph generation (SGG) uses vision-language models (VLMs) to produce structured, machine-readable descriptions of video content. Unlike traditional SGG methods that rely on fixed vocabul…
-
Multimodal LLMs enhance understanding with diverse data types
Multimodal applications are systems that process and generate several data types, including text, images, and audio, so that LLMs can interpret the world more like humans do. Datasets such as Conceptual Captions and Visual Geno…
-
New benchmark and model advance explainable AI for content moderation
Researchers have developed SenBen, a new benchmark dataset for explainable content moderation in images, featuring scene graphs with detailed object attributes and sensitivity tags. They also created a compact 241M para…
-
New framework U-CECE enhances AI explainability with multi-resolution concept analysis
Researchers have introduced U-CECE, a novel framework designed to enhance the explainability of complex AI models. This universal, multi-resolution system offers adaptable levels of conceptual counterfactual explanation…