Anthropic ships Claude Opus 4.8 as a "modest but tangible improvement" that tops GPT-5.5 in most benchmarks

Anthropic's Claude Opus 4.8 surpasses GPT-5.5 in benchmarks

By PulseAugur Editorial Team · [1 source] · 2026-05-28 21:20

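Anthropic has released Claude Opus 4.8, a new iteration of its large language model that delivers a modest but tangible improvement over its predecessor. The latest version outperforms OpenAI's GPT-5.5 and Google's Gemini 3.1 Pro in most benchmarks. Claude Opus 4.8 also shows stronger self-correction, catching its own coding errors at four times the rate of the previous version.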

Impact: Sets a new SOTA on coding benchmarks; puts pressure on OpenAI to respond.

Ranking rationale: Frontier-lab model release with system card. [lever_c_demoted from frontier_release: ic=1 ai=1.0]

AI-generated summary · Google Gemini · from 1 source.

Sources [1]

The Decoder TIER_1 English(EN) · Matthias Bastian · 2026-05-28 21:20

Anthropic ships Claude Opus 4.8 as a "modest but tangible improvement" that tops GPT-5.5 in most benchmarks

Anthropic releases Claude Opus 4.8, which beat…
