Brief · PulseAugur

COMMENTARY · dev.to — LLM tag English(EN) · 4h · [2 sources]

We Tracked 1M LLM API Calls — 60% Were Wasting Money on the Wrong Model

An analysis of one million LLM API calls found that a large share of AI spending is wasted because developers default to more powerful, more expensive models than their tasks require. According to the study, 60-70% of API calls could be handled by cheaper models, and combining model routing with prompt caching could cut costs by up to 95%. This inefficiency contributes to rising AI costs, with average monthly spend reaching $85,500 per company in 2025.
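To make the idea of model routing concrete, here is a minimal sketch, assuming a simple keyword-and-length heuristic. The tier names, the relative costs, and the keyword list are all hypothetical placeholders for illustration; they are not taken from the study and do not reflect any provider's actual pricing or routing logic.

```python
# Hypothetical tier names and relative per-call costs (placeholders, not real pricing).
CHEAP_MODEL = "small-model"
STRONG_MODEL = "large-model"
RELATIVE_COST = {CHEAP_MODEL: 1.0, STRONG_MODEL: 20.0}

# Assumed heuristic: keywords hinting that a request needs multi-step reasoning.
HARD_KEYWORDS = ("prove", "refactor", "debug", "architecture", "step by step")


def route(prompt: str, max_cheap_words: int = 200) -> str:
    """Pick a model tier for a prompt using a length/keyword heuristic."""
    text = prompt.lower()
    looks_hard = any(keyword in text for keyword in HARD_KEYWORDS)
    too_long = len(text.split()) > max_cheap_words
    return STRONG_MODEL if (looks_hard or too_long) else CHEAP_MODEL


def savings_vs_always_strong(prompts: list[str]) -> float:
    """Fraction of cost saved by routing, versus sending every call to the strong tier."""
    if not prompts:
        return 0.0
    routed = sum(RELATIVE_COST[route(p)] for p in prompts)
    baseline = RELATIVE_COST[STRONG_MODEL] * len(prompts)
    return 1.0 - routed / baseline


if __name__ == "__main__":
    sample = [
        "Translate 'hello' to French",
        "Summarize this paragraph in one sentence",
        "Debug this stack trace and explain the root cause",
    ]
    for p in sample:
        print(route(p), "<-", p)
    print(f"estimated savings: {savings_vs_always_strong(sample):.0%}")
```

In practice a router would more likely rely on a trained classifier or on quality checks of the cheap model's output than on keywords; the sketch only shows where the routing decision sits and how the savings estimate follows from it.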

IMPACT: Highlights significant cost-saving opportunities for AI operators through optimized model selection and routing.

OpenAI
GPT-4o
DeepSeek V3
GPT-4o-mini
Claude Sonnet 4
Stack Overflow
Claude Haiku 3.5
CloudZero
Tokonomics
Prem AI