EQPO: Equitable Group Relative Policy Optimization for Clinical Reasoning
Researchers have developed EQPO, a reinforcement learning method designed to improve both the fairness and the accuracy of AI models in clinical reasoning. The method adaptively reweights training samples so that learning stays balanced across demographic groups. When demographic data is unavailable, it uses unsupervised clustering to identify subpopulations instead. EQPO has demonstrated significant reductions in accuracy disparities and F1 score gaps across various diagnostic benchmarks and modalities. The researchers have also released new equitability-aware clinical VLLMs that achieve state-of-the-art performance with smaller demographic gaps.
AI IMPACT: Enhances fairness in clinical AI, potentially improving diagnostic outcomes for underrepresented groups and setting a new standard for equitable medical AI development.
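
The summary does not give EQPO's actual formulas, so the following Python sketch is only a rough illustration of the general idea it describes: group-relative advantages that are reweighted per subpopulation, with clustering standing in when demographic labels are missing. The function names, the exponential weighting rule, the `temperature` parameter, and the use of plain k-means are all assumptions made for illustration, not the authors' method.

```python
import numpy as np


def group_relative_advantages(rewards):
    """GRPO-style advantages: standardize rewards within each prompt's samples.

    rewards has shape (n_prompts, n_samples_per_prompt).
    """
    rewards = np.asarray(rewards, dtype=float)
    mean = rewards.mean(axis=1, keepdims=True)
    std = rewards.std(axis=1, keepdims=True)
    return (rewards - mean) / (std + 1e-8)


def kmeans_labels(x, k, n_iter=20, seed=0):
    """Plain k-means (an assumed stand-in for the unsupervised clustering step)."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every point to every center.
        dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = x[labels == j].mean(axis=0)
    return labels


def equity_weights(labels, correct, temperature=1.0):
    """Hypothetical rule: lower-accuracy subpopulations receive larger weights.

    Returns one weight per prompt, normalized to have mean 1.
    """
    labels = np.asarray(labels)
    correct = np.asarray(correct, dtype=float)
    groups = np.unique(labels)
    acc = np.array([correct[labels == g].mean() for g in groups])
    raw = np.exp((1.0 - acc) / temperature)
    per_prompt = raw[np.searchsorted(groups, labels)]
    return per_prompt / per_prompt.mean()


def reweighted_advantages(rewards, labels):
    """Scale each prompt's group-relative advantages by its subpopulation weight."""
    rewards = np.asarray(rewards, dtype=float)
    correct = rewards.mean(axis=1)  # per-prompt accuracy for 0/1 rewards
    weights = equity_weights(labels, correct)
    return group_relative_advantages(rewards) * weights[:, None]


# Toy data: 4 prompts x 4 sampled answers, with 0/1 correctness rewards.
rewards = np.array([[1, 1, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0]])
# With no demographic labels, cluster (made-up) prompt embeddings instead.
embeddings = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
labels = kmeans_labels(embeddings, k=2)
advantages = reweighted_advantages(rewards, labels)
```

In this toy setup the second cluster has lower accuracy, so its advantages are scaled up relative to the first cluster's. A real system would feed these scaled advantages into the policy-gradient loss; that step is omitted here.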