METR measures GPT-4 post-training enhancements, finding significant capability gains

By PulseAugur Editorial · [1 source] · 2024-03-15 08:00

Researchers at METR ran experiments to measure how post-training enhancements affect AI agent capabilities. They found that OpenAI's own post-training of GPT-4 raised agent performance by 26 percentage points, a gain comparable to the jump from GPT-3.5 Turbo to GPT-4. The researchers' own attempts to improve agent performance further yielded smaller, statistically insignificant gains, which they take to suggest that substantial capability increases may be hard to achieve once a model has been competently fine-tuned for agency.

IMPACT Suggests that post-training enhancements by developers can significantly boost AI agent performance, potentially impacting safety evaluations.




AI-generated summary · Google Gemini · from 1 source.


COVERAGE [1]

METR (Model Evaluation & Threat Research) TIER_1 English(EN) · 2024-03-15 08:00

Measuring the impact of post-training enhancements

Our example evaluation protocol (https://metr.org/blog/2024-03-15-example-autonomy-evaluation-protocol/) suggests adding safety margin to take into account increases in dangerous capabilities that could be unlocked by further post-training enhancements. Those enhan…

