Brief · PulseAugur

COMMENTARY · Mastodon — fosstodon.org English(EN) · 5h · [3 sources]

I Let 12 AI Models Predict the World Cup. The First 169 Picks Already Show a Pattern. I put 12 AI models into a public World Cup prediction arena. Not because I

A test in which 12 AI models predicted World Cup matches found no clear overall winner, although several models, including Qwen3.5 Flash, Claude Opus 4.7, and Claude Sonnet 4.6, demonstrated perfect accuracy on individual predictions. The models shared a bias toward established favorites, which led to incorrect predictions when upsets occurred. The experiment also highlighted large cost differences: cheaper models such as Qwen3.5 Flash were orders of magnitude less expensive than premium models such as Claude Opus 4.7 on similar prediction tasks, which suggests room for cost-effective routing strategies.

IMPACT Highlights potential for cost-effective AI routing strategies and reveals common biases in LLM predictions.
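
One way to picture the cost-effective routing idea is a rule that sends each request to the cheapest model whose estimated accuracy clears a required bar. The sketch below is only an illustration under stated assumptions: `ModelOption`, `pick_model`, the model labels, and every number in `OPTIONS` are hypothetical placeholders, since the brief reports neither prices nor accuracy rates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    name: str
    cost_per_call: float  # placeholder units, not real pricing
    est_accuracy: float   # placeholder estimate in [0, 1], not a measured result


def pick_model(options, min_accuracy):
    """Return the cheapest option whose estimated accuracy meets the bar.

    If no option qualifies, fall back to the most accurate one.
    """
    if not options:
        raise ValueError("options must not be empty")
    qualified = [m for m in options if m.est_accuracy >= min_accuracy]
    if qualified:
        return min(qualified, key=lambda m: m.cost_per_call)
    return max(options, key=lambda m: m.est_accuracy)


# Illustrative numbers only (assumption): not taken from the experiment.
OPTIONS = [
    ModelOption("cheap-model", cost_per_call=0.001, est_accuracy=0.60),
    ModelOption("premium-model", cost_per_call=0.100, est_accuracy=0.62),
]

routine = pick_model(OPTIONS, min_accuracy=0.55)      # both qualify, cheapest wins
high_stakes = pick_model(OPTIONS, min_accuracy=0.61)  # only the premium option qualifies
print(routine.name, high_stakes.name)
```

With placeholder values like these, the premium model is selected only when the accuracy bar is set above what the cheaper model is estimated to reach, which is the trade-off a routing strategy would try to exploit.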