Researchers at METR have published analyses clarifying the limitations and assumptions behind their AI time horizon metric. Recent updates to their modeling, including a fix for a regularization mistake, show that time horizon estimates for newer models can decrease significantly, while the impact on older models is less pronounced. The researchers emphasize that the metric represents the amount of serial human labor an AI can replace, not how long the AI itself can work independently, and that current measurements have wide confidence intervals and are sensitive to benchmark construction and task distribution.
Summary written by gemini-2.5-flash-lite from 2 sources.