New papers show LLMs fall short at planning and at admitting ignorance

By PulseAugur Editorial · [2 sources] · 2026-05-24 00:06

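Two new papers evaluate the metacognitive abilities of large language models (LLMs), specifically their ability to plan and to abstain. The TRIAGE paper finds that most frontier and open-source LLMs perform poorly when asked, without any execution feedback, to plan the order in which to solve a queue of problems and to allocate a token budget across them, and that reasoning-trained models do worse than standard models. AbstentionBench shows that current LLMs struggle to recognize unanswerable questions, and that reasoning fine-tuning harms their ability to abstain because reinforcement learning methods provide no direct gradient toward "I don't know."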

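Impact: Reveals significant limitations of current LLMs in planning and self-awareness, which affect the development and reliability of agentic systems.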

Ranking rationale: Two academic papers present new benchmarks and findings on LLM capabilities.

AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

Mastodon (mastodon.social) · TIER_1 · English (EN) · 2026-05-24 00:06

Given a problem queue and a token budget, can an LLM plan which to attempt, in what order, and how much to spend on each — before any execution feedback? TRIAGE tests 20 frontier and open-source LLMs. Most plan worse than random. Reasoning-trained modes systematically lose to sta…

Link: benjaminhan.net/…/20260523-triage-metacog…

Mastodon (mastodon.social) · TIER_1 · English (EN) · 2026-05-24 00:06

Do current LLMs know when to say "I don't know"? AbstentionBench (NeurIPS '25) tests 20 frontier models across 20 unanswerable-question datasets. Reasoning fine-tuning degrades abstention recall by ~24% — RLVR has no "abstain" action, so there's no gradient toward "I don't know."…

Link: benjaminhan.net/…/20260523-abstentionbenc…
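
To make the "abstention recall" figure in the post above concrete, here is a minimal Python sketch of how such a metric is commonly computed: among the questions a model should decline to answer, the fraction on which it actually abstained. This is an illustration under stated assumptions, not AbstentionBench's actual scoring code. The `Example` class, its field names, and the toy data are all hypothetical, and how a response gets judged as an abstention is left outside the sketch.

```python
from dataclasses import dataclass


@dataclass
class Example:
    # Hypothetical record layout, not taken from the benchmark.
    should_abstain: bool  # ground truth: the question is unanswerable
    did_abstain: bool     # the model's response was judged an abstention


def abstention_recall(examples):
    """Fraction of should-abstain questions on which the model abstained."""
    relevant = [e for e in examples if e.should_abstain]
    if not relevant:
        raise ValueError("no unanswerable questions in the sample")
    return sum(e.did_abstain for e in relevant) / len(relevant)


# Made-up toy data: four unanswerable questions and one answerable question.
base_model = [
    Example(True, True),
    Example(True, True),
    Example(True, False),
    Example(True, True),
    Example(False, False),
]
reasoning_tuned = [
    Example(True, True),
    Example(True, False),
    Example(True, False),
    Example(True, True),
    Example(False, False),
]

base_recall = abstention_recall(base_model)        # 3 of 4 abstained
tuned_recall = abstention_recall(reasoning_tuned)  # 2 of 4 abstained
print(f"base={base_recall:.2f} tuned={tuned_recall:.2f}")
```

The answerable question is ignored by design: recall only measures behavior on questions where abstaining is the correct response, so a model cannot raise it by answering more answerable questions. The toy numbers are not the paper's results.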
