Brief · PulseAugur

TOOL · arXiv cs.CV English(EN) · 10h

Bridging Modality Disconnect in Self-Reflection via Closed-Loop Visually Grounded Verification

Researchers have introduced MIRROR, a framework intended to improve reasoning in Vision-Language Models (VLMs). MIRROR targets hallucinations and logic errors with a closed-loop process: the model drafts an answer, critiques it, and then visually verifies it against specific image regions. To train the model, the authors built a new dataset, ReflectV, which provides multi-turn supervision with explicit reflection triggers and region-based verification actions.
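
The draft, critique, and verify loop described above can be illustrated with a minimal Python sketch. This is not MIRROR's actual implementation: the names `closed_loop_answer` and `Critique`, the `(x0, y0, x1, y1)` region format, the `max_rounds` cap, and the toy stand-ins for the model and the image are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Hypothetical region format: a bounding box (x0, y0, x1, y1).
Region = Tuple[int, int, int, int]


@dataclass
class Critique:
    """Result of the critique step (illustrative structure, not from the paper)."""
    needs_check: bool               # acts as the "reflection trigger"
    region: Optional[Region] = None  # image region to inspect, if any
    note: str = ""


def closed_loop_answer(
    question: str,
    draft: Callable[[str], str],
    critique: Callable[[str, str], Critique],
    verify: Callable[[Region], str],
    revise: Callable[[str, str, str], str],
    max_rounds: int = 3,
) -> str:
    """Draft an answer, then repeatedly critique it and check it against a region."""
    answer = draft(question)
    for _ in range(max_rounds):
        c = critique(question, answer)
        if not c.needs_check or c.region is None:
            break  # the critique raised no further doubt
        evidence = verify(c.region)  # look at the image region again
        answer = revise(question, answer, evidence)
    return answer


# Toy stand-ins: the "image" is a lookup from region to a text description.
image = {(0, 0, 10, 10): "blue mug"}
checked = []  # evidence that has already been verified


def draft(question: str) -> str:
    return "red mug"  # deliberately wrong first draft


def critique(question: str, answer: str) -> Critique:
    if answer in checked:
        return Critique(False)
    return Critique(True, (0, 0, 10, 10), "object color not yet verified")


def verify(region: Region) -> str:
    return image[region]


def revise(question: str, answer: str, evidence: str) -> str:
    checked.append(evidence)
    return evidence


result = closed_loop_answer("What is on the table?", draft, critique, verify, revise)
print(result)
```

In this toy run the wrong first draft is replaced once the flagged region has been inspected, and the loop stops when the critique raises no further doubt. In a real system each callable would be a model call, and the verification step would operate on image pixels rather than on a lookup table.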

MIRROR
Haoyu Zhang
ReflectV