I stress-tested Claude Mythos

Anthropic's Claude Mythos can handle complex coding tasks

By the PulseAugur editorial team · [1 source] · 2026-06-09 21:27

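A user tested Anthropic's new Claude Mythos model by having it build a prototype browser game. The test evaluated the model's ability to handle complex, long-running coding tasks rather than simple prompts. The user found Mythos better suited to large, complex projects and agentic coding, though slower and more expensive than smaller models.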

Impact: Demonstrates the potential of advanced AI models to handle large-scale agentic coding projects.

Ranking rationale: A user-driven stress test evaluating a new model's capabilities on complex tasks.

AI-generated summary · Google Gemini · from 1 source.

Sources [1]

r/Anthropic · /u/Code_Almighty · 2026-06-09 21:27

I stress-tested Claude Mythos

I wanted to test Claude’s new Mythos-class model on something more practical than just benchmark charts.

So instead of asking it to write a blog post or solve a small coding problem, I gave it a bigger agentic coding task: build a small GT…