
TOOL · arXiv cs.AI · 7h

RadOT-Eval: Auditable Structured-Evidence Transport for Radiology Report Evaluation

Researchers have introduced RadOT-Eval, a framework for evaluating the accuracy of AI-generated radiology reports. It decomposes each report into structured clinical evidence units and uses optimal transport to align corresponding units across the reports being compared. On independent datasets, RadOT-Eval's scores correlated strongly with human-annotated error burdens, outperforming existing metrics and an LLM-based evaluator.
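To make the alignment idea concrete, here is a minimal sketch of optimal transport between two sets of evidence units. It is not the RadOT-Eval implementation: the unit schema (finding, location, polarity), the field-mismatch cost, the uniform masses, and the function names `unit_cost`, `sinkhorn`, and `align` are all assumptions made for illustration, and the solver is a plain entropic-regularized Sinkhorn iteration.

```python
import math

# Hypothetical evidence unit: (finding, anatomical location, polarity).
# The actual RadOT-Eval schema is not described in this brief.

def unit_cost(a, b):
    """Fraction of fields that disagree between two units (0.0 to 1.0)."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def sinkhorn(cost, reg=0.05, iters=500):
    """Entropic-regularized optimal transport with uniform masses (assumed)."""
    n, m = len(cost), len(cost[0])
    r, c = [1.0 / n] * n, [1.0 / m] * m
    # Gibbs kernel: cheap pairings get large weights.
    K = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [r[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [c[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

def align(reference, candidate):
    """Return the transport plan and the total transport cost."""
    cost = [[unit_cost(a, b) for b in candidate] for a in reference]
    plan = sinkhorn(cost)
    total = sum(
        plan[i][j] * cost[i][j]
        for i in range(len(reference))
        for j in range(len(candidate))
    )
    return plan, total

reference = [("pneumothorax", "right", "present"), ("effusion", "left", "absent")]
candidate = [("effusion", "left", "absent"), ("pneumothorax", "left", "present")]

plan, total = align(reference, candidate)
# plan[i][j] is the mass moved from reference unit i to candidate unit j;
# the large entries show which units were matched, which is what makes
# the resulting score inspectable.
```

In this toy case the plan pairs the two pneumothorax units (which differ only in laterality) and the two identical effusion units, so the total cost reflects a single location error rather than a wholesale mismatch.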

IMPACT Provides a more auditable and accurate method for evaluating high-stakes AI-generated clinical text, potentially improving safety and reliability in medical applications.

LLM
RadOT-Eval
GREEN-radllama2-7B