Brief · PulseAugur

SIGNIFICANT · Mastodon — mastodon.social English(EN) · 4h

📰 GPT-5.5 beats Claude Fable 5 on the Agents Last Exam benchmark. A new benchmark called Agents Last Exam (ALE), created by Berkeley RDI with over 300 experts…

OpenAI's GPT-5.5 has outperformed Anthropic's Claude Fable 5 on a new AI benchmark called Agents Last Exam (ALE). This benchmark, developed by Berkeley RDI with input from over 300 experts, tests autonomous AI agents. The result is surprising, as Claude Fable 5 was previously considered the leading model for such tasks.

IMPACT · Sets a new performance standard for AI agents, potentially shifting the competitive landscape and influencing future development priorities.

Anthropic
OpenAI
GPT-5.5
Agents Last Exam
Claude Fable 5
Berkeley RDI