Beyond Symmetric Alignment: Spectral Diagnostics of Modality Imbalance in Vision-Language Models in the Medical Domain
Researchers have developed a new metric, the Spectral Alignment Score (SAS), to diagnose modality imbalance in Vision-Language Models (VLMs), particularly in the medical domain. Unlike existing symmetric metrics, SAS provides directional scores that identify which modality (image or text) is causing performance degradation. In experiments on 15 VLMs across medical and natural datasets, SAS captured the richer information in medical images relative to their text descriptions and correlated with retrieval performance better than other metrics.
AI IMPACT: Provides a new diagnostic tool for improving the reliability of medical VLMs.