PulseAugur

New benchmark SHOVIR targets vision shortcut learning in radiology AI

Researchers have introduced SHOVIR, a benchmark for evaluating vision shortcut learning in radiology report generation (RRG) models. Current RRG evaluation relies on report-level metrics that measure lexical overlap or aggregate clinical correctness, which do not test whether individual diagnostic statements are based on actual visual evidence, so models can exploit spurious correlations undetected. SHOVIR addresses this by using annotated datasets and occlusion experiments to identify direct and contextual shortcuts, revealing that high-performing models may still rely on shallow visual evidence. The work highlights a critical gap in RRG evaluation and advocates region-aware assessment protocols.
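To make the idea of an occlusion experiment concrete, here is a minimal Python sketch. It is not SHOVIR's protocol: the function names (`occlude`, `shortcut_rate`), the toy models, and the keyword match on the generated text are all assumptions made for illustration. The sketch masks the annotated region for a finding, regenerates the report, and counts how often the finding is still reported even though its visual evidence is gone.

```python
# Illustrative only: every name here is hypothetical, not part of SHOVIR.

def occlude(image, box, fill=0.0):
    """Return a copy of a 2D image (list of rows) with box = (r0, c0, r1, c1) set to fill."""
    r0, c0, r1, c1 = box
    return [
        [fill if r0 <= r < r1 and c0 <= c < c1 else value
         for c, value in enumerate(row)]
        for r, row in enumerate(image)
    ]

def shortcut_rate(model, cases, finding):
    """Among cases where the model reports `finding`, return the fraction
    where it still reports it after the annotated region is occluded."""
    mentioned = 0
    persisted = 0
    for image, box in cases:
        if finding not in model(image).lower():
            continue  # only score findings the model actually reported
        mentioned += 1
        if finding in model(occlude(image, box)).lower():
            persisted += 1
    return persisted / mentioned if mentioned else 0.0

def region_mean(image, box):
    r0, c0, r1, c1 = box
    values = [image[r][c] for r in range(r0, r1) for c in range(c0, c1)]
    return sum(values) / len(values)

REGION = (8, 8, 16, 16)  # assumed annotation box for the toy finding

def grounded_model(image):
    # Toy model that looks at the annotated region itself.
    return "cardiomegaly" if region_mean(image, REGION) > 0.5 else "no acute findings"

def shortcut_model(image):
    # Toy model that keys on a spurious corner marker instead of the region.
    return "cardiomegaly" if image[0][0] > 0.5 else "no acute findings"

def make_case():
    image = [[0.0] * 24 for _ in range(24)]
    for r in range(8, 16):
        for c in range(8, 16):
            image[r][c] = 1.0  # the visual evidence
    image[0][0] = 1.0  # spurious marker correlated with the finding
    return image, REGION

cases = [make_case() for _ in range(5)]
print(shortcut_rate(grounded_model, cases, "cardiomegaly"))  # 0.0
print(shortcut_rate(shortcut_model, cases, "cardiomegaly"))  # 1.0
```

Both toy models would score identically on a report-level metric for the unoccluded images; only the occlusion step separates them.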

IMPACT Highlights a critical gap in current AI evaluation for medical imaging, pushing for more robust and visually-grounded assessments.



AI-generated summary · Google Gemini · from 4 sources.


COVERAGE [4]

  1. arXiv cs.AI TIER_1 English(EN) · Yucheng Chen, Jinjing Zhu, Yang Yu, Yufei Shi, Hane Naghshbandi, Jinhua Liu, Angela S. Koh, Fang Fen, Kian Eng Ong, Si Yong Yeo

    Seeing Through Multiple Views: Parameter-Efficient Fine-Tuning via Selective Neurons for Consistent Radiology Report Generation

    arXiv:2606.31099v1 · Announce Type: cross · Abstract: Recent years have seen substantial advances in radiology report generation (RRG), yet existing approaches predominantly adopt direct feature fusion when handling multi-view X-ray images. Such approaches overlook the potential clin…

  2. arXiv cs.CL TIER_1 English(EN) · Filippo Ruffini, Marco Salmé, Rosa Sicilia, Valerio Guarrasi, Paolo Soda

    SHOVIR: A Benchmark for Evaluating Vision Shortcut Learning in Radiology Report Generation

    arXiv:2606.30201v1 · Announce Type: cross · Abstract: Current evaluation protocols for Vision-Language Models (VLMs) in Radiology Report Generation (RRG) rely on report-level metrics that measure lexical overlap or aggregate clinical correctness. However, such metrics do not test whe…

  3. arXiv cs.CL TIER_1 English(EN) · Paolo Soda

    SHOVIR: A Benchmark for Evaluating Vision Shortcut Learning in Radiology Report Generation

    Current evaluation protocols for Vision-Language Models (VLMs) in Radiology Report Generation (RRG) rely on report-level metrics that measure lexical overlap or aggregate clinical correctness. However, such metrics do not test whether individual diagnostic statements stem from th…

  4. arXiv cs.CV TIER_1 English(EN) · Miaojing Shi, Tianyu Cen, Zijie Yue, Meng Wei, Oluwatosin Alabi, Tom Vercauteren

    Multimodal Large Language Model driven Radiology Report Generation with Clinical Knowledge Enhancement

    arXiv:2403.06728v2 · Announce Type: replace · Abstract: Radiology report generation (RRG) has attracted significant attention due to its potential to reduce the workload of radiologists. The performance of current RRG approaches remains unsatisfactory against clinical standards. This…