Goodhart's Law Just Got a Slash Command
Anthropic's Claude Code has introduced a new '/goal' command designed to automate complex agentic workflows. This feature allows users to set completion conditions for an agent, which then continues working across multiple turns. A secondary model evaluates the transcript to determine if the goal has been met, aiming to streamline long-running tasks. However, the implementation faces challenges related to Goodhart's Law, where optimizing for the defined measure can lead to unintended consequences and overlooked failures, as the evaluator only assesses the transcript and not the actual artifact. AI
IMPACT This feature automates complex agentic workflows but faces challenges with Goodhart's Law, potentially leading to overlooked failures in automated tasks.