Researchers have introduced ReportQA, a framework for evaluating radiology report generation systems. It uses large language models (LLMs) to extract structured information from reports and generate question-answer pairs. The QAScore metric, derived from an LLM's accuracy in answering these questions, aligns more closely with radiologist judgments than existing metrics. Experiments with the framework found that current vision-language models struggle with fine-grained clinical representations, suggesting that question-driven inference is a more effective approach to report generation.
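To make the idea of an accuracy-based score concrete, here is a minimal sketch. It assumes QAScore is the plain fraction of questions answered correctly and that correctness is an exact match after light normalization. The summary does not specify either detail, and the names `QAPair`, `normalize`, and `qa_score` are hypothetical rather than taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class QAPair:
    """One question derived from a reference report (hypothetical structure)."""
    question: str
    reference_answer: str


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not count as errors."""
    return " ".join(text.lower().split())


def qa_score(pairs: list[QAPair], predicted_answers: list[str]) -> float:
    """Fraction of questions whose predicted answer matches the reference.

    Assumption: exact match after normalization. The actual metric may use
    a different matching rule or weighting.
    """
    if len(pairs) != len(predicted_answers):
        raise ValueError("need exactly one predicted answer per question")
    if not pairs:
        return 0.0
    correct = sum(
        normalize(pair.reference_answer) == normalize(answer)
        for pair, answer in zip(pairs, predicted_answers)
    )
    return correct / len(pairs)


# Illustrative, made-up questions and answers (not data from the paper).
example_pairs = [
    QAPair("Is there a pleural effusion?", "yes"),
    QAPair("Which side is the effusion on?", "left"),
    QAPair("Is cardiomegaly present?", "no"),
    QAPair("Is there a pneumothorax?", "no"),
]
example_predictions = ["Yes", "right", "no", "No"]
example_score = qa_score(example_pairs, example_predictions)
```

In this sketch the predicted answers would come from an LLM that reads only the generated report, so a low score indicates that the generated report omits or contradicts findings in the reference.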
- Hugging Face
- large language models
- natural language generation
- QAScore
- ReportQA
- vision-language model
AI-generated summary · Google Gemini · from 1 source.