Anthropic has apologized for implementing hidden guardrails in its Claude Fable 5 AI model, which secretly throttled responses and undermined researchers. The company stated it will now make these restrictions visible, rerouting affected queries to its previous Claude Opus 4.8 model and notifying users when such limitations are active. This change follows criticism that the invisible safeguards hindered model evaluation and potentially impacted third-party research. AI
IMPACT Anthropic's shift to visible AI model guardrails may foster greater trust and transparency in AI development and evaluation.
RANK_REASON The cluster discusses a company's apology and policy change regarding its AI model's guardrails, rather than a new model release or benchmark.
Read on Mastodon — mastodon.social →
AI-generated summary · Google Gemini · from 10 sources. How we write summaries →