PulseAugur
AI agents get smarter through metacognition and prompt optimization

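Recent research explores advanced agent architectures that go beyond simple retry loops for completing complex tasks. Work such as "Supervising Ralph Wiggum" shows that moving metacognitive critique into a separate agent markedly improves performance on design tasks compared with self-monitoring or a basic retry mechanism. ReMA echoes this trend, using a meta-thinker and executor pair to improve mathematical reasoning. The common theme across these papers is the benefit of decomposing agent functions, whether for metacognition, planning, or prompt optimization, which suggests that current LLMs may already have the basic building blocks for more sophisticated self-improvement.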
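The split between a working agent and a separate critic can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the method from any of the cited papers: `run_with_critic`, `ToyDesigner`, and `toy_critic` are hypothetical names, and both "agents" are plain deterministic functions standing in for LLM calls.

```python
from typing import Callable, Optional, Tuple


def run_with_critic(
    propose: Callable[[Optional[str]], str],
    critique: Callable[[str], Optional[str]],
    max_rounds: int = 5,
) -> Tuple[str, int]:
    """Alternate between a proposing agent and a separate critic.

    `critique` returns None to accept a candidate, or a feedback string
    that is handed back to `propose` on the next round.
    """
    feedback: Optional[str] = None
    candidate = ""
    for round_no in range(1, max_rounds + 1):
        candidate = propose(feedback)
        feedback = critique(candidate)
        if feedback is None:
            return candidate, round_no
    return candidate, max_rounds


class ToyDesigner:
    """Hypothetical stand-in for a design agent (no LLM involved)."""

    def __init__(self) -> None:
        self.cells = 2

    def __call__(self, feedback: Optional[str]) -> str:
        # Add cells only when the critic has asked for a change.
        if feedback is not None:
            self.cells += 2
        return f"pack with {self.cells} cells"


def toy_critic(candidate: str) -> Optional[str]:
    """Hypothetical critic: accepts once the pack has at least 6 cells."""
    cells = int(candidate.split()[2])
    return None if cells >= 6 else "capacity too low, add cells"


design, rounds = run_with_critic(ToyDesigner(), toy_critic)
print(design, rounds)
```

A plain retry loop would call `propose` repeatedly without feedback; here the critic's judgment lives outside the proposer, which is the structural point the summary describes.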

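Impact: Decomposing agent functions into specialized components shows promise for improving performance on complex tasks and could lead to more capable AI systems.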

Ranking rationale: Several research and position papers explore novel agent architectures and metacognitive approaches.

AI-generated summary · Google Gemini · from 7 sources.

Sources [7]

  1. Mastodon — fosstodon.org TIER_1 English(EN)

    Supervising Ralph Wiggum: pairing a design agent with a separate metacognitive critic beats a plain retry loop AND a self-monitoring agent on battery-pack design. Metacognitive prompts alone don't help; moving them to a different agent does. Converges with ReMA's math-reasoning r…

  2. Mastodon — fosstodon.org TIER_1 English(EN)

    Position paper: today's self-improving agents lean on extrinsic metacognition — fixed human-designed loops about what to monitor, when to switch strategies. Genuine self-improvement needs the agent itself to decide those. The intrinsic/extrinsic axis is the right lens for recent …

  3. Mastodon — fosstodon.org TIER_1 English(EN)

    MetaSPO meta-learns a task-agnostic system prompt via a bilevel loop: outer tunes system prompt across tasks, inner tunes per-task user prompts. Generalizes to 14 unseen tasks across 5 domains. The decomposition is the contribution. Once prompts split into task-agnostic (system) …

  4. Mastodon — fosstodon.org TIER_1 English(EN)

    ReMA trains a two-agent RL setup: a meta-thinker plans reasoning, an executor carries it out. Trained jointly with multi-agent RL, beats R1-style single-agent baselines on math. The split-agent pattern keeps showing up. Supervising Ralph Wiggum (engineering design, prompted) runs…

  5. Mastodon — fosstodon.org TIER_1 English(EN)

    MASS optimizes multi-agent LLM systems by interleaving prompt and topology search: block-level prompts, topology rejection sampling, then workflow-level prompts. Topology gets quietly demoted. Ablation on Gemini 1.5 Pro: ~6% gain from block prompts, 3% from topology, 2% from work…

  6. Mastodon — fosstodon.org TIER_1 English(EN)

    DSPy turns LM pipelines into typed-module graphs and compiles them end-to-end against a single metric, bootstrapping its own few-shot demonstrations. The programming-model layer is the real contribution, not any specific teleprompter. Once pipelines are typed graphs, pipeline-lev…

  7. Mastodon — fosstodon.org TIER_1 English(EN)

    EvoPrompt runs an evolutionary search over a population of prompts, with an LLM implementing crossover and mutation. Differential Evolution beats Genetic Algorithm on most BIG-Bench Hard tasks. One of the cleanest early examples of an LLM as *operator* in an optimization loop, no…
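The "LLM as operator" idea in the last item can be sketched as a small evolutionary loop over prompts. This is a simplified genetic-algorithm-style illustration, not EvoPrompt's actual procedure or its Differential Evolution variant. Everything here is an assumption made for the example: `evolve_prompts`, `toy_operator`, and `toy_score` are hypothetical names, the operator is a deterministic word-splicing stub where a real system would call an LLM, and the score is a toy keyword count rather than a benchmark metric.

```python
import random
from typing import Callable, List

Operator = Callable[[str, str, random.Random], str]


def evolve_prompts(
    population: List[str],
    score: Callable[[str], float],
    operator: Operator,
    generations: int = 20,
    seed: int = 0,
) -> str:
    """Evolve a prompt population; `operator` plays the role of the LLM."""
    rng = random.Random(seed)
    pop = list(population)  # copy so the caller's list is not modified
    for _ in range(generations):
        parent_a, parent_b = rng.sample(pop, 2)
        child = operator(parent_a, parent_b, rng)
        # Replace the current worst prompt only if the child scores higher.
        worst = min(pop, key=score)
        if score(child) > score(worst):
            pop[pop.index(worst)] = child
    return max(pop, key=score)


KEYWORDS = ("step", "verify", "concise")
VOCAB = ("step", "verify", "concise", "please", "carefully")


def toy_operator(a: str, b: str, rng: random.Random) -> str:
    """Stub for the LLM: splice two parents, then append one random word."""
    words_a, words_b = a.split(), b.split()
    child = words_a[: len(words_a) // 2] + words_b[len(words_b) // 2:]
    child.append(rng.choice(VOCAB))
    return " ".join(child)


def toy_score(prompt: str) -> float:
    """Toy fitness: how many target keywords appear in the prompt."""
    words = prompt.split()
    return float(sum(keyword in words for keyword in KEYWORDS))


seed_prompts = ["answer the question", "think step by step", "be concise and clear"]
best = evolve_prompts(seed_prompts, toy_score, toy_operator)
print(best, toy_score(best))
```

Because a child only ever replaces the lowest-scoring prompt, and only when it scores strictly higher, the best score in the population cannot decrease from one generation to the next.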