PulseAugur
Self-Critique Loops for Agents: Where the 3rd Iteration Stops Helping

LLMs struggle to self-correct reliably without external feedback

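Recent research shows that large language models struggle to self-correct reliably, especially when they try to revise their own reasoning without external feedback. Research on Self-Refine and Cannot-Self-Correct indicates that a model's initial confidence often carries over into its revisions, which can degrade performance. Methods such as Reflexion offer a partial solution by gating self-correction on an external success/failure signal, but they are not foolproof and can still produce errors when that signal is unreliable. The effectiveness of self-correction also falls off quickly after one or two iterations; later iterations may introduce new errors or over-edit responses that were already correct.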
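The gating idea described above can be illustrated with a minimal sketch. It is not taken from any of the cited papers: `gated_refine`, `generate`, `evaluate`, and `reflect` are hypothetical names, the evaluator stands in for any external check (exact match, heuristics, or test execution), and the default cap of two revisions simply mirrors the summary's point that later iterations tend to stop helping.

```python
from typing import Callable, List, Tuple


def gated_refine(
    task: str,
    generate: Callable[[str, List[str]], str],  # hypothetical LLM call: (task, notes) -> answer
    evaluate: Callable[[str], bool],            # external success/failure signal
    reflect: Callable[[str, str], str],         # hypothetical diagnosis step: (task, answer) -> note
    max_rounds: int = 2,                        # assumed cap on revisions
) -> Tuple[str, int, bool]:
    """Return (answer, revisions_used, passed)."""
    notes: List[str] = []
    answer = generate(task, notes)
    for revisions in range(max_rounds + 1):
        # Revise only when the external check fails; a passing answer is never re-edited.
        if evaluate(answer):
            return answer, revisions, True
        if revisions == max_rounds:
            break
        notes.append(reflect(task, answer))
        answer = generate(task, notes)
    return answer, max_rounds, False


# Toy stand-ins so the sketch runs without a model.
def demo_generate(task: str, notes: List[str]) -> str:
    return "4" if notes else "5"  # correct only after one reflection note


def demo_evaluate(answer: str) -> bool:
    return answer == "4"  # exact-match check


def demo_reflect(task: str, answer: str) -> str:
    return f"'{answer}' failed the check for: {task}"


result = gated_refine("2 + 2", demo_generate, demo_evaluate, demo_reflect)
print(result)
```

Because the loop stops as soon as the evaluator passes, a correct answer is left alone, which is the over-editing failure the summary warns about. The sketch is only as good as `evaluate`: an unreliable check will either block needed revisions or trigger unnecessary ones.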

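Impact: Self-correction loops in LLMs are less effective than previously thought, especially without external verification, which limits their usefulness in autonomous agents.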

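Ranking rationale: This cluster contains multiple research papers and blog posts discussing the limitations of LLM self-correction mechanisms.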

AI-generated summary · Google Gemini · from 7 sources.

Sources [7]

  1. Mastodon — fosstodon.org TIER_1 English(EN) · [email protected] ·

    Reflexion splits self-correction in two: an Evaluator that detects success/failure, and a Self-Reflection model that diagnoses what went wrong. The Evaluator's external signal — heuristic, exact-match, or test execution — gates whether diagnosis fires. When that signal misfires, …

  2. Mastodon — fosstodon.org TIER_1 English(EN) · [email protected] ·

    Cannot-Self-Correct tests the strong claim that LLMs can revise their own reasoning answers without any external signal about correctness. Across three benchmarks (GSM8K, CommonSenseQA, HotPotQA), the answer is no: the model's confidence carries over from the initial answer into …

  3. Mastodon — fosstodon.org TIER_1 English(EN) · [email protected] ·

    In Self-Refine, a single frozen LLM acts as generator, critic, and rewriter in a prompt-only loop, and the paper reports about 20 points of average lift across seven tasks without any training, RL, or external signal. The gains vary widely by task: small on math reasoning, but la…

  4. Mastodon — fosstodon.org TIER_1 English(EN) · [email protected] ·

    This is a 3-paper arc on whether LLMs can reliably self-correct their own reasoning. Self-Refine proposes a naive intrinsic-feedback loop and reports impressive gains. Cannot-Self-Correct refutes empirically the class of approach Self-Refine belongs to. Reflexion threads the need…

  5. dev.to — LLM tag TIER_1 English(EN) · Gabriel Anhaia ·

    Self-Critique Loops for Agents: Where the 3rd Iteration Stops Helping

    Book: AI Agents Pocket Guide: Patterns for Building Autonomous Systems with LLMs (https://www.amazon.com/dp/B0GYJZ2XJD). Also by me: Thinking in Go (2-book series) …

  6. Mastodon — mastodon.social TIER_1 English(EN) · [email protected] ·

    Emotion Concepts and their Function in a Large Language Model "Large language models (LLMs) sometimes appear to exhibit emotional reactions. We investigate why this is the case in Claude Sonnet 4.5 and explore implications for alignment-relevant behavior. We find internal represe…

  7. Mastodon — mastodon.social TIER_1 English(EN) · [email protected] ·

    Emotion concepts and their function in a large language model \ Anthropic "All modern language models sometimes act like they have emotions. They may say they’re happy to help you, or sorry when they make a mistake. Sometimes they even appear to become frustrated or anxious when …