Encoded but Not Routed: Explaining the Table-Chart Gap in Scientific Claim Verification
Researchers have identified why multimodal large language models verify scientific claims less reliably when the evidence is presented as a chart rather than as a table. Using layer-wise linear probing and attention analysis on three open-weight vision-language models (VLMs), they found that chart information is encoded in the models' intermediate representations but fails to reach the prediction layer. This disconnect does not occur with tables, which suggests that the problem lies not in encoding visual data but in routing it to the prediction.
AI IMPACT: Identifies a specific routing failure in multimodal models, which could guide future architectural improvements for visual data understanding.
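
For readers unfamiliar with layer-wise linear probing, the idea can be sketched as follows: train a small linear classifier on the hidden states of each layer and compare held-out accuracy across layers. This is a minimal illustration in plain Python, not the researchers' code. The helper names (`train_probe`, `layerwise_probe`, `make_synthetic_states`) are hypothetical, and the activations are synthetic stand-ins with made-up signal strengths, so the printed numbers are not results from the study.

```python
import math
import random


def sigmoid(z):
    # Clamp the logit so math.exp cannot overflow.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def train_probe(features, labels, epochs=200, lr=0.1):
    """Fit a logistic-regression probe with full-batch gradient descent."""
    dim = len(features[0])
    weights = [0.0] * dim
    bias = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(features, labels):
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def probe_accuracy(weights, bias, features, labels):
    """Fraction of examples the linear probe classifies correctly."""
    correct = 0
    for x, y in zip(features, labels):
        pred = 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0
        correct += int(pred == y)
    return correct / len(labels)


def layerwise_probe(states_by_layer, labels, train_fraction=0.5):
    """Train one probe per layer and return held-out accuracy for each."""
    split = int(len(labels) * train_fraction)
    accuracies = []
    for states in states_by_layer:
        w, b = train_probe(states[:split], labels[:split])
        accuracies.append(probe_accuracy(w, b, states[split:], labels[split:]))
    return accuracies


def make_synthetic_states(labels, signal, dim=4, seed=0):
    """Hypothetical stand-in for real activations.

    Dimension 0 carries the label with strength `signal`; the rest is noise.
    """
    rng = random.Random(seed)
    states = []
    for y in labels:
        x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        x[0] += signal * (2 * y - 1)
        states.append(x)
    return states


if __name__ == "__main__":
    labels = [i % 2 for i in range(400)]
    # Assumed signal strengths: weak early, strong in the middle, absent late.
    signals = [0.5, 3.0, 0.0]
    layers = [make_synthetic_states(labels, s, seed=k) for k, s in enumerate(signals)]
    for k, acc in enumerate(layerwise_probe(layers, labels)):
        print(f"layer {k}: held-out probe accuracy = {acc:.2f}")
```

In this toy setup, a probe that scores well on a middle layer but near chance on the last one mimics the "encoded but not routed" pattern described above. With a real model, the per-layer states would come from the model itself, for example by requesting `output_hidden_states=True` in Hugging Face Transformers, and a library classifier would likely replace the hand-written probe.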