Brief · PulseAugur

TOOL · arXiv cs.CL · English (EN) · 6d

When Correct Decisions Hide Internal Stress: Decision-State Probing in Multimodal Language Models

Researchers have developed S$^3$E, a framework that evaluates multimodal language models by probing their internal decision states under semantic stress. The method contrasts an image-supported caption with semantically similar but incorrect options, then analyzes the model's hidden states to detect instability even when its external behavior remains correct. Studies on models such as Qwen3VL, Gemma3, and InternVL3 found that semantic stress can cause significant internal state displacement, suggesting that external correctness alone does not guarantee stable internal decision geometry.
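To make "internal state displacement" concrete, here is a minimal sketch of how one might compare a hidden state from an unstressed prompt with one from a stressed prompt. It is not the S$^3$E procedure: the brief does not say which layer, pooling, or distance the authors use, so the function names, the choice of Euclidean and cosine distance, and the threshold are all illustrative assumptions, and the vectors are toy values rather than real model activations.

```python
import math

# Hypothetical helpers: the brief does not specify S^3E's actual metric.

def l2_displacement(h_base, h_stress):
    """Euclidean distance between two hidden-state vectors of equal length."""
    if len(h_base) != len(h_stress):
        raise ValueError("hidden states must have the same dimensionality")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h_base, h_stress)))

def cosine_distance(h_base, h_stress):
    """1 - cosine similarity; 0 means same direction, 1 means orthogonal."""
    if len(h_base) != len(h_stress):
        raise ValueError("hidden states must have the same dimensionality")
    dot = sum(a * b for a, b in zip(h_base, h_stress))
    norm = math.sqrt(sum(a * a for a in h_base)) * math.sqrt(sum(b * b for b in h_stress))
    if norm == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / norm

def hides_internal_stress(answer_still_correct, h_base, h_stress, threshold):
    """Flag cases where the answer stays correct but the state moved a lot.

    `threshold` is an assumed, user-chosen cutoff, not a value from the paper.
    """
    return answer_still_correct and l2_displacement(h_base, h_stress) > threshold

# Toy vectors standing in for pooled hidden states (not real activations).
h_base = [1.0, 0.0, 0.0]    # caption vs. clearly wrong options
h_stress = [0.0, 1.0, 0.0]  # caption vs. semantically similar wrong options

print(l2_displacement(h_base, h_stress))
print(cosine_distance(h_base, h_stress))
print(hides_internal_stress(True, h_base, h_stress, threshold=1.0))
```

In practice the two vectors would come from the same layer and token position of the model under the two prompt conditions. The point of the flag is the case the brief highlights: the chosen caption is unchanged, yet the internal representation has shifted.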

IMPACT Introduces a method to assess internal model stability beyond external performance, potentially improving safety and reliability evaluations.

Gemma3
Qwen3VL
InternVL3
S$^3$E