Anthropic's latest AI models show tool-use regression, report claims

By PulseAugur Editorial · [1 sources] · 2026-07-05 03:01

Armin Ronacher, creator of Flask and Jinja, reports that Anthropic's latest AI models, Opus 4.8 and Sonnet 5, have regressed in tool use: during extended coding sessions they fabricate non-existent parameters in approximately 20% of tool calls. The issue did not appear in older Anthropic models or in OpenAI's Codex models. Ronacher suggests the root cause may be Anthropic's training environment, which is forgiving of malformed tool calls and so may lead the models to invent fields when they later interact with stricter schemas. According to the report, implementing a 'Strict mode' and removing conversational history significantly reduce these failures.
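
To make the difference between a forgiving and a strict tool schema concrete, here is a minimal sketch in Python. It is illustrative only: the `read_file` tool, its parameters, and the `validate_tool_call` helper are hypothetical names invented for this example, and it does not reproduce Anthropic's API, its 'Strict mode', or Ronacher's test setup. It only shows how the same call with an invented field passes silently under forgiving validation and fails under strict validation.

```python
from typing import Any

# Hypothetical schema for an imaginary `read_file` tool:
# parameter name -> expected Python type.
READ_FILE_SCHEMA: dict[str, type] = {"path": str, "max_bytes": int}
REQUIRED_FIELDS = {"path"}


def validate_tool_call(args: dict[str, Any], strict: bool) -> dict[str, Any]:
    """Validate tool-call arguments against READ_FILE_SCHEMA.

    strict=True:  unknown fields raise ValueError.
    strict=False: unknown fields are silently dropped (forgiving).
    """
    missing = REQUIRED_FIELDS - set(args)
    if missing:
        raise ValueError(f"missing required fields: {sorted(missing)}")

    unknown = set(args) - set(READ_FILE_SCHEMA)
    if unknown and strict:
        raise ValueError(f"unknown fields: {sorted(unknown)}")

    cleaned: dict[str, Any] = {}
    for name, value in args.items():
        if name in unknown:
            continue  # forgiving mode: ignore the invented field
        if not isinstance(value, READ_FILE_SCHEMA[name]):
            raise ValueError(f"field {name!r} has the wrong type")
        cleaned[name] = value
    return cleaned


# A call with one invented field, `encoding`, that the schema does not define.
call = {"path": "app.py", "encoding": "utf-8"}

# Forgiving validation accepts the call and discards the extra field.
print(validate_tool_call(call, strict=False))

# Strict validation rejects the same call.
try:
    validate_tool_call(call, strict=True)
except ValueError as err:
    print("rejected:", err)
```

Under the forgiving path the caller never learns that it sent a made-up field, which is the kind of environment the report suggests could let the habit go uncorrected.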

IMPACT Potential issues with tool use in advanced AI models could impact the reliability of AI agents in complex tasks.

RANK_REASON This is a commentary on a reported issue with Anthropic's models, not a direct release or announcement from Anthropic.

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

dev.to — Anthropic tag TIER_1 English(EN) · Breach Protocol · 2026-07-05 03:01

A Flask Creator Says Anthropic's Newest Models Got Worse at Using Tools

Anthropic's newest AI models are inventing extra, made-up fields when they call external tools, according to an essay published July 4, 2026 by Armin Ronacher, the creator of the Flask and Jinja web frameworks and Sentry's founder. On long multi-step coding sessions, Opus 4.8 …
