📰 GPT-5.5 beats Claude Fable 5 on the Agents Last Exam benchmark
A new benchmark called Agents Last Exam (ALE), created by Berkeley RDI with over 300 experts...
OpenAI's GPT-5.5 has outperformed Anthropic's Claude Fable 5 on a new AI benchmark called Agents Last Exam (ALE). This benchmark, developed by Berkeley RDI with input from over 300 experts, tests autonomous AI agents. The result is surprising, as Claude Fable 5 was previously considered the leading model for such tasks.
IMPACT: Sets a new performance standard for AI agents, potentially shifting the competitive landscape and influencing future development priorities.