PulseAugur
research · [1 source] ·

METR finds GPT-5.1-Codex-Max poses low risk for AI R&D automation

METR evaluated OpenAI's GPT-5.1-Codex-Max and found it to be a low-risk, incremental improvement over previous models. The evaluation focused on two risks, AI R&D automation and rogue replication, and concluded that on current trends neither threat is likely to materialize significantly in the next six months. METR acknowledges, however, that unforeseen breakthroughs or increased compute scale could change these projections.

Summary written by gemini-2.5-flash-lite from 1 source.

IMPACT Suggests that, on current AI development trends, the risk of AI R&D automation and rogue replication is low in the near term.

RANK_REASON The report is an evaluation of a specific model's safety implications, not a release of a new model or a major policy shift.


COVERAGE [1]

  1. METR (Model Evaluation & Threat Research) TIER_1 (CA) ·

    GPT-5.1-Codex-Max Evaluation Results

    Note on independence: This evaluation was conducted under a standard NDA. Due to the se…