
Anthropic's Claude model reaches a 3-4 hour METR task horizon

By the PulseAugur editorial team · [1 source] · 2026-06-03 18:10

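In late May, Anthropic's Claude Mythos model reached a METR 80% task horizon of 3-4 hours. In early May, the best superforecasters had predicted that the longest horizons would reach that level only by the end of the year. Ethan Mollick noted the result on Bluesky.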

Impact: Shows significant progress in AI's ability to handle complex, multi-hour tasks, which may shape the capabilities of future AI agents.

Ranking rationale: This cluster reports a specific benchmark result for an AI model that matches the level expert forecasters had predicted.

AI-generated summary · Google Gemini · based on 1 source.

Sources [1]

Bluesky Jetstream — AI desk TIER_1 English(EN) · emollick.bsky.social · 2026-06-03 18:10


In early May, the best superforecasters predicted that, by the end of the year, the longest METR 80% task horizons would reach 3-4 hours. In late May, Claude Mythos achieved that number.