PulseAugur

Conflicting studies emerge on LLM abstention and chain-of-thought

Two recent papers report seemingly conflicting findings on whether large language models can abstain from answering effectively and whether chain-of-thought prompting helps. A COLING 2025 study finds that prompted chain-of-thought raises abstention in instruction-tuned models. The AbstentionBench paper (NeurIPS 2025), by contrast, finds that extending the reasoning budget lowers abstention in models trained for reasoning.

IMPACT Conflicting research on LLM abstention highlights ongoing challenges in model control and reliability.

RANK_REASON The cluster discusses findings from two academic papers presented at conferences, focusing on LLM capabilities.


AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. Mastodon — fosstodon.org TIER_1 English(EN) · [email protected] ·

    On whether LLMs can abstain effectively and whether chain-of-thought can help, two recent papers seem at odds on the surface. COLING 2025 finds prompted CoT raises abstention on instruct models. AbstentionBench (NeurIPS 2025) finds extending the reasoning budget lowers it on a tr…