What LLMs explain is not what they believe: Evaluating explanation sufficiency under models' own input beliefs

New metric SCSuff evaluates the sufficiency of LLM explanations

By the PulseAugur editorial team · [2 sources] · 2026-06-26 21:14

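A new research paper introduces SCSuff, an information-theoretic metric for evaluating the sufficiency of free-text explanations generated by large language models (LLMs). The study argues that explanation sufficiency may be distribution-dependent and proposes using the LLM itself to generate alternative inputs, thereby capturing the model's own beliefs. Experiments show that LLM explanations are generally insufficient and that sufficiency correlates only weakly with model size or accuracy, although SCSuff scores can be predicted from the model's internal representations.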

Impact: This research could lead to more reliable and trustworthy LLM explanations, which is critical for high-stakes applications.

Ranking rationale: A research paper introducing a new metric for evaluating LLM explanations.

Read on arXiv stat.ML →

AI-generated summary · Google Gemini · from 2 sources. How we write summaries →

Coverage sources [2]

arXiv stat.ML TIER_1 English(EN) · Nhi Nguyen, Shauli Ravfogel, Rajesh Ranganath · 2026-06-30 04:00

What LLMs explain is not what they believe: Evaluating explanation sufficiency under models' own input beliefs

arXiv:2606.28615v1 Announce Type: cross Abstract: Large language models (LLMs) are increasingly deployed in high-stakes domains, where free-text explanations such as chain-of-thought and post-hoc rationales are used to justify model outputs. Yet it remains unclear whether these e…

arXiv stat.ML TIER_1 English(EN) · Rajesh Ranganath · 2026-06-26 21:14

What LLMs explain is not what they believe: Evaluating explanation sufficiency under models' own input beliefs

Large language models (LLMs) are increasingly deployed in high-stakes domains, where free-text explanations such as chain-of-thought and post-hoc rationales are used to justify model outputs. Yet it remains unclear whether these explanations are sufficient, i.e., if they contain …
