Brief · PulseAugur

TOOL · dev.to (LLM tag) · English (EN) · 2w

We Asked 10 LLMs to Write Efficient Code. Only 4 Got Better.

A recent evaluation of ten large language models found that only GPT-5.4 consistently produced more efficient code when explicitly prompted to do so. For most models, efficiency-focused prompts had minimal or even negative impact, while GPT-5.4 showed significant gains on tasks such as configuration generation and HTML creation. Gemma 4 31B emerged as a cost-effective alternative, producing efficient code without special prompting at a much lower cost, whereas Cohere Command A's efficiency decreased when prompted.

IMPACT Confirms that explicit prompting for efficiency does not universally improve LLM code generation; the effect depends on the model and may point to training misalignments.

Gemini 2.5 Flash
GPT-5.4
Kimi K2.6
Gemma 4 31B
Qwen 3.6 Plus
DeepSeek Chat
Cohere Command A