Wasserstein Equilibrium Decoding for Reliable Medical Visual Question Answering
Researchers have developed a new decoding method, Wasserstein Equilibrium Decoding, designed to improve the reliability of small vision-language models (2-8B parameters) on medical visual question answering (VQA) tasks. The approach extends game-theoretic decoding to open-ended medical VQA by using a semantically aware Wasserstein stopping criterion. On datasets such as VQA-RAD and PathVQA, the method consistently improves accuracy and requires fewer inference iterations than traditional baselines.
AI IMPACT: Enhances the reliability and efficiency of small vision-language models for specialized medical applications.