tool · [1 source] · 2026-05-21 16:14

Advanced LLMs show inverse scaling in forecasting, study finds

By PulseAugur Editorial · Summary by gemini-2.5-flash-lite from 1 source

A new research paper on arXiv reports that more capable large language models can produce worse forecasts on problems of a kind common in finance and epidemiology. According to the study, the more advanced models give less accurate distributional forecasts when the underlying time series exhibits superlinear growth and a tail risk of regime change. This inverse scaling effect was observed across simulated and real-world datasets, with the failure concentrated in the upper tail of predictions, where models aggressively extrapolate growth.
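To make "worse distributional forecasts in the upper tail" concrete, here is a minimal sketch using the pinball (quantile) loss, a standard score for a single predicted quantile. This is an illustration only: the summary does not say which metric the paper uses, and the function name and all numbers below are hypothetical.

```python
def pinball_loss(y_true: float, q_pred: float, tau: float) -> float:
    """Pinball (quantile) loss for one forecast of the tau-quantile.

    Lower is better. Under-prediction is weighted by tau and
    over-prediction by (1 - tau).
    """
    diff = y_true - q_pred
    return max(tau * diff, (tau - 1.0) * diff)


# Hypothetical values, not taken from the paper.
realized = 100.0          # what actually happened (e.g. growth stalled)
restrained_q95 = 130.0    # a restrained 95th-percentile forecast
extrapolated_q95 = 400.0  # an upper tail that extrapolates growth aggressively

loss_restrained = pinball_loss(realized, restrained_q95, 0.95)
loss_extrapolated = pinball_loss(realized, extrapolated_q95, 0.95)

# The aggressively extrapolated upper quantile scores worse.
print(loss_restrained < loss_extrapolated)
```

Under this score, an upper quantile placed far above the realized value is penalized in proportion to the overshoot, which is one way a forecast can look reasonable at the median yet score poorly in the tail.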


IMPACT Suggests current LLM evaluation metrics may not capture real-world forecasting risks, potentially impacting deployment in critical domains.

RANK_REASON The cluster contains an academic paper detailing new research findings on LLM capabilities.


paper
safety

COVERAGE [1]

arXiv cs.AI TIER_1 · Ezra Karger · 2026-05-21 16:14

Is Capability a Liability? More Capable Language Models Make Worse Forecasts When It Matters Most

We document inverse scaling in LLMs on forecasting problems whose underlying time series exhibit superlinear growth and tail risk of regime change, a structure common in finance and epidemiology. On these tasks, more capable models produce worse distributional forecasts. The patt…

