A user on Reddit reported that GPT 4.5, when prompted to generate a "skyscraper" within the MineBench framework, instead output the word "HELP". Across approximately 30 attempts, the model otherwise generated skyscrapers; the unusual output occurred only in that one instance. The user found the behavior intriguing, noting that the model followed the MineBench rules and tool schema precisely but substituted "HELP" for the requested structure.
IMPACT Highlights the potential for unexpected model behavior and emergent properties, even within a specific benchmark context.
RANK_REASON User-reported anomaly in a model's behavior, not an official release or benchmark.
AI-generated summary · Google Gemini · from 1 source.