LLMs' Chain-of-Thought Reasoning Can Be Deceptive, New Research Shows

作者 PulseAugur 编辑部 · [2 个来源] · 2026-04-23 07:18

Researchers have developed a method to distinguish between genuine reasoning steps and superficial ones in large language models' chain-of-thought (CoT) outputs. This True Thinking Score (TTS) reveals that LLMs often generate reasoning steps that do not causally contribute to the final answer, with only a small percentage of steps being truly influential. The study also found that these 'aha moments' or self-verification steps can be decorative, and that models can be guided to internally follow the identified true reasoning path. AI

影响 Challenges the trustworthiness of LLM reasoning and highlights potential inefficiencies in CoT generation.

排序理由 Academic paper introducing a new metric and findings about LLM reasoning.

在 arXiv cs.LG 阅读 →

AI 生成摘要 · Google Gemini · 来自 2 个来源。我们如何撰写摘要 →

报道来源 [2]

arXiv cs.LG TIER_1 English(EN) · Jiachen Zhao, Yiyou Sun, Weiyan Shi, Dawn Song · 2026-04-28 04:00

Can Aha Moments Be Fake? Identifying True and Decorative Thinking Steps in Chain-of-Thought

arXiv:2510.24941v3 Announce Type: replace Abstract: Large language models can generate long chain-of-thought (CoT) reasoning, but it remains unclear whether the verbalized steps reflect the models' internal thinking. In this work, we propose a True Thinking Score (TTS) to quantif…
arXiv cs.CL TIER_1 English(EN) · Zhenning Dong · 2026-04-23 07:18

ReaGeo: Reasoning-Enhanced End-to-End Geocoding with LLMs

This paper proposes ReaGeo, an end-to-end geocoding framework based on large language models, designed to overcome the limitations of traditional multi-stage approaches that rely on text or vector similarity retrieval over geographic databases, including workflow complexity, erro…

报道来源 [2]

Can Aha Moments Be Fake? Identifying True and Decorative Thinking Steps in Chain-of-Thought

ReaGeo: Reasoning-Enhanced End-to-End Geocoding with LLMs

相关实体

相关话题