ENTITY metre

metre

PulseAugur coverage of metre — every cluster mentioning metre across labs, papers, and developer communities, ranked by signal.

Total · 30d

0 over 90d

Releases · 30d

0 over 90d

Papers · 30d

0 over 90d

TIER MIX · 90D

No coverage in the last 90 days.

RELATIONSHIPS

competes with Claude 3.5 Sonnet 70%
used by RE-Bench 70%
used by Claude 3.5 Sonnet 50%

TIMELINE

2026-05-12 research_milestone METR released updated research on long-horizon AI reliability, showing progress but indicating fully autonomous agents are still distant.

SENTIMENT · 30D

4 day(s) with sentiment data

RECENT · PAGE 2/2 · 27 TOTAL

FRONTIER RELEASE · CL_01848 · Sep 12 · 10:01

OpenAI releases o3 and o4-mini models with advanced reasoning and tool capabilities

OpenAI has released its new o3 and o4-mini models, which represent a significant advancement in reasoning capabilities and tool integration within ChatGPT. The o3 model is positioned as OpenAI's most powerful reasoning …

RESEARCH · CL_12647 · Aug 7 · 17:00

METR finds GPT-4o shows impressive agent skills but suffers fixable failures

METR has released preliminary findings from an evaluation of GPT-4o's autonomous capabilities across 77 tasks. The model demonstrated impressive skills like systematic exploration but also exhibited failure modes such a…

RESEARCH · CL_12648 · Mar 15 · 11:00

METR proposes autonomy evaluation protocol for AI risks

The Model Evaluation & Threat Research (METR) initiative has released an example protocol for assessing AI models' potential for autonomy-related risks. This protocol focuses on systems capable of executing harmful task…

RESEARCH · CL_12649 · Mar 15 · 09:00

METR releases guidelines for eliciting AI model capabilities and risks

The Model Evaluation & Threat Research (METR) organization has published guidelines for assessing AI model capabilities, focusing on elicitation techniques. These guidelines aim to measure a model's potential performanc…

RESEARCH · CL_12650 · Mar 15 · 08:00

METR measures GPT-4 post-training enhancements, finding significant capability gains

Researchers at METR have conducted experiments to measure the impact of post-training enhancements on AI agent capabilities. Their findings indicate that OpenAI's own post-training efforts on GPT-4 significantly boosted…

RESEARCH · CL_03855 · Feb 7 · 08:00

2023 Year In Review

METR, an AI safety research organization, detailed its 2023 accomplishments, including developing methodologies for evaluating AI agents on autonomous tasks and contributing to OpenAI's GPT-4 system card. The organizati…

SIGNIFICANT · CL_00419 · Mar 11 · 11:00

OpenAI partners with US National Labs, proposes AI policy to White House

OpenAI has submitted proposals to the White House Office of Science and Technology for the US AI Action Plan, focusing on strengthening American AI leadership through regulatory, export control, copyright, and infrastru…
