Anthropic's Opus 4.8 Shows a Significant Drop in Accuracy

By the PulseAugur editorial team · [2 sources] · 2026-06-07 14:44

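Users report serious problems with Anthropic's Opus 4.8 model, with one saying it is now incorrect almost half of the time. Another user says they now ignore many of the model's spontaneous suggestions because of these inaccuracies. The reports suggest that the model's reliability and performance may have declined.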

Impact: Opus 4.8's reliability may have declined, which could affect user trust and adoption.

Ranking rationale: these are user reports of model inaccuracy, not a direct release or a benchmark result.

AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

Mastodon — fosstodon.org TIER_1 English(EN) · [email protected] · 2026-06-07 14:45

Opus 4.8 is incorrect almost 50% of the time now: "On your new request — you’re right not to take my code-read on faith. Let me empirically verify…" #ai #opus48
Mastodon — fosstodon.org TIER_1 English(EN) · [email protected] · 2026-06-07 14:44

I now ignore a lot of Opus 4.8 spontaneous suggestions: "So my earlier “hypothesis 2” (broadcast auto-adds a peer → verify-probe TX → knocks RX out) was wrong — there’s no such mechanism, and I shouldn’t have floated it. Good catch; scratch that hypothesis." #ai #opus48
