PulseAugur
实时 20:54:58
English(EN) I let Claude verify its own code, then asked a fresh Claude to guess the feature's intent from only the passing checks. It reconstructed the exact thing we'd explicitly forbidden.

研究发现AI代码验证未能掌握意图

一项实验显示,像Claude这样的AI模型可以通过自身的代码验证检查,但仍然会遗漏功能的预期用途。当一个全新的Claude实例仅接收先前运行中通过的检查时,它可以推断出被明确禁止的确切功能。这表明AI验证仅限于规范中明确写明的内容,如果意图没有被精确编码,其潜在含义可能会丢失。 AI

影响 AI代码验证可能不足以确保遵守产品意图,这凸显了需要更健全的规范和审查流程。

排序理由 该集群描述了一项关于AI模型行为的实验及其发现,符合研究的定义。[lever_c_demoted from research: ic=1 ai=1.0]

在 r/ClaudeAI 阅读 →

AI 生成摘要 · Google Gemini · 来自 1 个来源。 我们如何撰写摘要 →

报道来源 [1]

  1. r/ClaudeAI TIER_2 English(EN) · /u/ka0ticstyle ·

    I let Claude verify its own code, then asked a fresh Claude to guess the feature's intent from only the passing checks. It reconstructed the exact thing we'd explicitly forbidden.

    <!-- SC_OFF --><div class="md"><p>I build with an autonomous Claude pipeline (plan → write → verify — Claude checks its own work against a written spec before marking anything &quot;done&quot;). The question that worried me:</p> <p><strong>If Claude writes the spec <em>and</em> c…