
New benchmark reveals LVLMs hallucinate due to text priors, proposes fine-tuning fix

Researchers have developed a new benchmark, HalluScope, to investigate hallucinations in large vision-language models (LVLMs). Their findings indicate that these models often generate outputs that are not grounded in the visual input because they over-rely on textual priors and background knowledge, particularly cues carried in the prompt instructions. To address this, the authors introduce HalluVL-DPO, a fine-tuning framework that uses preference optimization to encourage visually grounded responses, mitigating prompt-induced hallucinations while preserving the models' other capabilities.

Summary written by gemini-2.5-flash-lite from 2 sources.
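The excerpts don't spell out the training objective, but the "-DPO" suffix points to Direct Preference Optimization over response pairs, with the visually grounded answer preferred over the hallucinated one. Below is a minimal sketch of the standard DPO loss under that assumption; the function name, the beta value, and the dummy inputs are illustrative, not details taken from the paper.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss (hypothetical stand-in for HalluVL-DPO's objective).

    Each argument is a batch of summed token log-probabilities for a
    response: "chosen" = the visually grounded answer, "rejected" = the
    hallucinated one; ref_* come from a frozen reference model.
    """
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    # Push the policy to widen the margin between grounded and
    # hallucinated responses relative to the reference model.
    return -F.logsigmoid(beta * (chosen_logratio - rejected_logratio)).mean()

# Toy usage with dummy log-probabilities for a batch of two pairs:
loss = dpo_loss(
    policy_chosen_logp=torch.tensor([-12.3, -9.8]),
    policy_rejected_logp=torch.tensor([-11.0, -10.2]),
    ref_chosen_logp=torch.tensor([-13.1, -10.5]),
    ref_rejected_logp=torch.tensor([-10.9, -10.4]),
)
```

Because the frozen reference model anchors the log-ratios, the fine-tuned model is rewarded for preferring grounded responses without drifting far from its original behavior, which is consistent with the summary's claim that other capabilities are maintained.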

IMPACT Introduces a new benchmark and method to reduce hallucinations in vision-language models, potentially improving their reliability.

RANK_REASON The cluster describes a new academic paper introducing a benchmark and a fine-tuning framework for large vision-language models.

Read on Hugging Face Daily Papers →


COVERAGE [2]

  1. Hugging Face Daily Papers TIER_1

    When Prompts Override Vision: Prompt-Induced Hallucinations in LVLMs

    Despite impressive progress in capabilities of large vision-language models (LVLMs), these systems remain vulnerable to hallucinations, i.e., outputs that are not grounded in the visual input. Prior work has attributed hallucinations in LVLMs to factors such as limitations of the…

  2. arXiv cs.CV TIER_1 · Matthieu Cord

    When Prompts Override Vision: Prompt-Induced Hallucinations in LVLMs
