How LLMs Detect and Correct Their Own Errors: The Role of Internal Confidence Signals

Large language models use internal confidence signals to detect and correct errors

By PulseAugur Editorial · [2 sources] · 2026-04-24 06:33

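Researchers investigated how large language models recognize and correct their own errors without external input, drawing an analogy to second-order models of confidence from decision neuroscience. Their findings indicate that a specific internal signal, cached after the answer, plays a crucial role in error detection and self-correction, beyond what simple token log-probabilities capture. The signal indicates not only that an error may be present but also whether the model has the knowledge needed to fix it, as shown in experiments with Gemma 3 27B and Qwen 2.5 7B.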
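For readers unfamiliar with the token log-probability baseline mentioned above, here is a minimal sketch of how such a first-order confidence score is commonly computed. It is not the paper's method or the internal signal it describes; the function names (`log_softmax`, `mean_token_logprob`) and the toy logits are assumptions made purely for illustration.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over one vocabulary-sized list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def mean_token_logprob(step_logits, chosen_ids):
    """Average log-probability the model assigned to the tokens it emitted.

    step_logits: one list of logits per generated token (hypothetical toy input).
    chosen_ids:  index of the emitted token at each step.
    """
    total = 0.0
    for logits, tok in zip(step_logits, chosen_ids):
        total += log_softmax(logits)[tok]
    return total / len(chosen_ids)

# Toy data (made up): a 3-token vocabulary and a 2-token answer.
confident = [[5.0, 0.0, 0.0], [0.0, 6.0, 0.0]]   # sharply peaked distributions
uncertain = [[1.0, 0.9, 0.8], [0.5, 0.6, 0.4]]   # nearly flat distributions
ids = [0, 1]

score_confident = mean_token_logprob(confident, ids)
score_uncertain = mean_token_logprob(uncertain, ids)
print(score_confident > score_uncertain)
```

A score like this depends only on the output distribution at generation time. The summary's claim is that the internal signal carries information beyond this kind of quantity, including whether the model has the knowledge needed to repair the error.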

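Impact: Reveals the internal mechanism behind LLM self-correction, which could improve reliability and reduce the need for external verification.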

Ranking rationale: Academic paper detailing a new finding on the self-correction mechanisms of large language models.

Read on arXiv cs.LG →

AI-generated summary · Google Gemini · based on 2 sources.

Sources [2]

arXiv cs.LG TIER_1 English(EN) · Dharshan Kumaran, Viorica Patraucean, Simon Osindero, Petar Velickovic, Nathaniel Daw · 2026-04-27 04:00

How LLMs Detect and Correct Their Own Errors: The Role of Internal Confidence Signals

arXiv:2604.22271v1 Announce Type: new Abstract: Large language models can detect their own errors and sometimes correct them without external feedback, but the underlying mechanisms remain unknown. We investigate this through the lens of second-order models of confidence from dec…
arXiv cs.LG TIER_1 English(EN) · Nathaniel Daw · 2026-04-24 06:33

How LLMs Detect and Correct Their Own Errors: The Role of Internal Confidence Signals

Large language models can detect their own errors and sometimes correct them without external feedback, but the underlying mechanisms remain unknown. We investigate this through the lens of second-order models of confidence from decision neuroscience. In a first-order system, con…
