PulseAugur
research · [2 sources] ·

RIHA Transformer aligns radiology images and reports hierarchically for better generation

Researchers have developed RIHA (Report-Image Hierarchical Alignment), a framework for radiology report generation that targets the difficulty of aligning complex visual features with the hierarchical structure of medical reports. Where previous methods treated reports as flat sequences, RIHA aligns images and text at three levels: paragraph, sentence, and word. It builds a Visual Feature Pyramid and a Text Feature Pyramid and integrates them through a Cross-modal Hierarchical Alignment module, which allows a more precise mapping between images and text. In experiments on the IU-Xray and MIMIC-CXR benchmarks, the authors report that RIHA surpasses existing state-of-the-art models on both natural language generation and clinical efficacy metrics.
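To make the idea of multi-level alignment concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the function names, the mean-pooling used to build each pyramid, the word/fine, sentence/mid, paragraph/coarse pairing, and the cosine-similarity softmax are all assumptions chosen for illustration, and the features are random placeholders.

```python
import numpy as np

def l2_normalize(x, eps=1e-8):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def text_pyramid(word_feats, sent_lens):
    """Hypothetical text pyramid: words -> sentences -> one paragraph vector.

    word_feats: (n_words, d) array; sent_lens: number of words in each sentence.
    Sentence and paragraph features are plain means (an assumption).
    """
    bounds = np.cumsum(sent_lens)[:-1]
    sentences = np.stack([s.mean(axis=0) for s in np.split(word_feats, bounds)])
    paragraph = sentences.mean(axis=0, keepdims=True)
    return {"word": word_feats, "sentence": sentences, "paragraph": paragraph}

def visual_pyramid(patch_feats, grid):
    """Hypothetical visual pyramid over a grid x grid patch map (grid must be even).

    fine: the patches themselves; mid: 2x2 average pooling; coarse: global mean.
    """
    d = patch_feats.shape[1]
    fmap = patch_feats.reshape(grid, grid, d)
    mid = fmap.reshape(grid // 2, 2, grid // 2, 2, d).mean(axis=(1, 3)).reshape(-1, d)
    coarse = patch_feats.mean(axis=0, keepdims=True)
    return {"fine": patch_feats, "mid": mid, "coarse": coarse}

def align(text_feats, vis_feats):
    """Soft alignment: each text unit gets a softmax distribution over visual units."""
    sim = l2_normalize(text_feats) @ l2_normalize(vis_feats).T
    weights = np.exp(sim - sim.max(axis=1, keepdims=True))
    return weights / weights.sum(axis=1, keepdims=True)

# Random placeholder features: 7 words in 2 sentences, a 4x4 patch grid, d = 8.
rng = np.random.default_rng(0)
tp = text_pyramid(rng.normal(size=(7, 8)), sent_lens=[3, 4])
vp = visual_pyramid(rng.normal(size=(16, 8)), grid=4)

# Assumed level pairing: word/fine, sentence/mid, paragraph/coarse.
alignments = {
    "word": align(tp["word"], vp["fine"]),
    "sentence": align(tp["sentence"], vp["mid"]),
    "paragraph": align(tp["paragraph"], vp["coarse"]),
}
for level, a in alignments.items():
    print(level, a.shape)
```

Each entry of `alignments` is a matrix whose rows sum to 1, so a row can be read as how strongly one text unit attends to each visual unit at that level. A trained model would learn these features and the alignment module rather than use fixed pooling.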

Summary written by gemini-2.5-flash-lite from 2 sources.

IMPACT Improves accuracy in generating diagnostic reports from medical images by enhancing cross-modal alignment.

RANK_REASON Academic paper introducing a new method for radiology report generation.


COVERAGE [2]

  1. arXiv cs.AI TIER_1 · Yucheng Chen, Yang Yu, Yufei Shi, Conghao Xiong, Xulei Yang, Si Yong Yeo ·

    RIHA: Report-Image Hierarchical Alignment for Radiology Report Generation

    arXiv:2604.27559v1 (announce type: cross). Abstract: Radiology report generation (RRG) has emerged as a promising approach to alleviate radiologists' workload and reduce human errors by automatically generating diagnostic reports from medical images. A key challenge in RRG is achiev…

  2. arXiv cs.CV TIER_1 · Si Yong Yeo ·

    RIHA: Report-Image Hierarchical Alignment for Radiology Report Generation

    Radiology report generation (RRG) has emerged as a promising approach to alleviate radiologists' workload and reduce human errors by automatically generating diagnostic reports from medical images. A key challenge in RRG is achieving fine-grained alignment between complex visual …