Opus 4.8 Failed A Lot Of My Coding Tests

Anthropic's Claude Opus 4.8 Shows Mixed Results in User Testing

By the PulseAugur editorial team · [1 source] · 2026-05-28 18:55

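One user tested Anthropic's Claude Opus 4.8 and got mixed results. The model did well on complex coding tasks such as building a fully functional HTML clone of macOS. On simpler generation tasks, however, such as creating a PS5 controller and a customer information form in a single HTML file, Opus 4.8 performed worse than earlier versions. The model also failed to correctly answer a logic question about whether to walk or drive to a car wash, which suggests that its improvements in some areas may have come with regressions in others.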

Impact: User feedback suggests that although Claude Opus 4.8's coding ability has improved, it may have regressed on specific tasks.

Ranking rationale: A user-generated review of a model release, not a primary-source announcement.


AI-generated summary · Google Gemini · from 1 source.


Sources [1]

r/Anthropic TIER_1 English(EN) · /u/LessPermission2503 · 2026-05-28 18:55

Opus 4.8 Failed A Lot Of My Coding Tests

https://www.reddit.com/r/Anthropic/comments/1tqcs2s/opus_48_failed_a_lot_of_my_coding_tests/
