METR has evaluated OpenAI's GPT-5.1-Codex-Max, finding it to be a low-risk incremental improvement over previous models. The evaluation focused on AI R&D automation and rogue replication risks, concluding that current trends suggest these threats are unlikely to materialize significantly in the next six months. However, METR acknowledges the possibility of unforeseen breakthroughs or increased compute scale impacting these projections.
Impact: Suggests current AI development trends pose low risk for AI R&D automation and rogue replication in the near term.
Ranking rationale: The report is an evaluation of a specific model's safety implications, not a release of a new model or a major policy shift.
Source: METR (Model Evaluation & Threat Research)
AI-generated summary · Google Gemini · from 1 source.