From Prompt Optimization to Multi-Dimensional Credibility Evaluation: Enhancing Trustworthiness of Chinese LLM-Generated Liver MRI Reports -- with Preliminary Extension to Lung Cancer
Researchers have developed a Multi-Dimensional Credibility Assessment (MDCA) framework to evaluate the trustworthiness of AI-generated radiology reports. The study focused on enhancing Chinese LLM-generated liver MRI reports and explored prompt optimization techniques, with a preliminary extension to lung cancer. Several LLMs, including Kimi-K2-Instruct-0905, Qwen3-235B-A22B-Instruct-2507, DeepSeek-V3, and ByteDance-Seed-OSS-36B-Instruct, were evaluated on the SiliconFlow platform.
AI IMPACT: Establishes a framework for evaluating AI-generated medical reports, potentially improving diagnostic accuracy and trust in AI tools within healthcare.