MedSAE: Dissecting MedCLIP Representations with Sparse Autoencoders
Researchers have developed MedSAE, a method for improving the interpretability of MedCLIP, a vision-language model used in medical imaging. MedSAE applies sparse autoencoders to MedCLIP's latent space, with the aim of making AI representations in healthcare more transparent and clinically reliable. In experiments on the CheXpert dataset, MedSAE neurons showed better monosemanticity and interpretability than raw MedCLIP features, which could pave the way for more trustworthy medical AI applications.
AI IMPACT: Enhances transparency in medical AI, potentially increasing trust in and adoption of AI tools in clinical settings.
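
For readers unfamiliar with the technique, the sketch below shows the general shape of a sparse autoencoder applied to an embedding vector: an overcomplete ReLU encoder, a linear decoder, and a loss that combines reconstruction error with an L1 sparsity penalty. It is a generic illustration, not MedSAE's actual architecture or training setup, which the summary does not describe. The class name `SparseAutoencoder`, the dimensions, the `l1_coeff` value, and the random stand-in embedding are all assumptions made for the example; no real MedCLIP features are used and no training is performed.

```python
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


class SparseAutoencoder:
    """Hypothetical minimal SAE: ReLU encoder, linear decoder, L1 penalty."""

    def __init__(self, d_in, d_hidden, seed=0):
        rng = random.Random(seed)
        scale = 1.0 / d_in ** 0.5
        # The hidden layer is wider than the input (overcomplete dictionary).
        self.W_enc = [[rng.gauss(0, scale) for _ in range(d_in)]
                      for _ in range(d_hidden)]
        self.b_enc = [0.0] * d_hidden
        self.W_dec = [[rng.gauss(0, scale) for _ in range(d_hidden)]
                      for _ in range(d_in)]
        self.b_dec = [0.0] * d_in

    def encode(self, x):
        # ReLU keeps activations non-negative, so many units are exactly zero.
        pre = matvec(self.W_enc, x)
        return [max(0.0, p + b) for p, b in zip(pre, self.b_enc)]

    def decode(self, z):
        out = matvec(self.W_dec, z)
        return [o + b for o, b in zip(out, self.b_dec)]

    def loss(self, x, l1_coeff=1e-3):
        # Squared reconstruction error plus an L1 penalty on the hidden code.
        z = self.encode(x)
        x_hat = self.decode(z)
        recon = sum((a - b) ** 2 for a, b in zip(x, x_hat))
        sparsity = sum(abs(a) for a in z)
        return recon + l1_coeff * sparsity, z


# Random stand-in for a single embedding vector (assumption: 8 dimensions).
rng = random.Random(1)
embedding = [rng.gauss(0, 1) for _ in range(8)]

sae = SparseAutoencoder(d_in=8, d_hidden=32)
total, codes = sae.loss(embedding)
active = sum(1 for c in codes if c > 0)
print(f"loss={total:.4f}, active units={active}/{len(codes)}")
```

In a trained sparse autoencoder of this kind, the L1 term pushes most hidden units to zero for any given input, and interpretability work then asks whether each remaining active unit responds to a single recognizable concept.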