PulseAugur

New framework probes multimodal LLMs for internal decision stress

Researchers have introduced S$^3$E, a framework that evaluates multimodal language models by probing their internal decision states under semantic stress. The method contrasts image-supported captions with semantically similar but incorrect alternatives, then analyzes hidden states to detect instability even when the model's external behavior remains correct. Experiments on models including Qwen3VL, Gemma3, and InternVL3 showed that semantic stress can substantially displace internal states, which suggests that external correctness alone does not guarantee stable internal decision geometry.
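To make the idea of "correct outside, displaced inside" concrete, here is a minimal Python sketch. It is not the S$^3$E implementation: the summary does not say how displacement is measured, so cosine distance, the 0.2 threshold, the function names, and the trial dictionary keys are all assumptions chosen for illustration, and the vectors are toy data rather than real hidden states.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("zero-norm hidden state")
    return 1.0 - dot / (norm_u * norm_v)


def hidden_stress_cases(trials, threshold=0.2):
    """Flag trials that are externally correct yet internally displaced.

    Each trial is a dict with hypothetical keys (assumed, not from the paper):
      'baseline' - hidden state for the image-supported caption alone
      'stressed' - hidden state when a similar but wrong caption is offered
      'correct'  - whether the model still picked the supported caption
    """
    flagged = []
    for index, trial in enumerate(trials):
        displacement = cosine_distance(trial["baseline"], trial["stressed"])
        # Only externally correct decisions count as "hidden" stress.
        if trial["correct"] and displacement > threshold:
            flagged.append((index, displacement))
    return flagged


# Toy data: 2-dimensional stand-ins for real hidden states.
trials = [
    {"baseline": [1.0, 0.0], "stressed": [1.0, 0.0], "correct": True},
    {"baseline": [1.0, 0.0], "stressed": [0.0, 1.0], "correct": True},
    {"baseline": [1.0, 0.0], "stressed": [0.0, 1.0], "correct": False},
]
flagged = hidden_stress_cases(trials)
```

In this toy run only the second trial is flagged: the model still answers correctly, but its state has moved far from the baseline. The third trial is equally displaced but already visible as an external error, so an accuracy-only evaluation would catch it anyway.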

IMPACT Introduces a method to assess internal model stability beyond external performance, potentially improving safety and reliability evaluations.

RANK_REASON Academic paper introducing a new evaluation framework for multimodal language models.


AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. arXiv cs.CL TIER_1 English(EN) · Eduard Hovy

    When Correct Decisions Hide Internal Stress: Decision-State Probing in Multimodal Language Models

    Multimodal language models are typically evaluated through external behavior: selecting the correct image-text match, rejecting unsupported captions, or answering visual queries correctly. However, correct behavior alone does not show that the model's internal decision state rem…