Few-shot examples can hurt LLM prompt performance

By PulseAugur Editorial · [1 source] · 2026-06-11 13:00

Adding more few-shot examples to an LLM prompt does not always improve performance, and can sometimes degrade it. In one experiment on a classification prompt, a version with six examples performed worse than one with four: the two additional examples lowered accuracy. The author found that longer, more detailed examples, particularly when placed at the end of the prompt, could skew the model's output through effects such as recency bias and distribution shift.
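The summary does not show how the author measured each example's contribution, so the following is only a minimal sketch of one way to do it: a leave-one-out ablation that removes each few-shot example in turn and records how accuracy on a held-out set changes. Every name here (`build_prompt`, `accuracy`, `leave_one_out`, `toy_classify`) and the `Input:`/`Label:` prompt layout are assumptions made for illustration, not the author's code. `toy_classify` is a hypothetical stand-in for a real model call that simply echoes the label of the last example, which mimics an extreme recency bias.

```python
from typing import Callable, Sequence

Example = tuple[str, str]  # (input text, gold label)


def build_prompt(examples: Sequence[Example], query: str) -> str:
    """Render the few-shot examples in order, followed by the unlabeled query."""
    blocks = [f"Input: {text}\nLabel: {label}" for text, label in examples]
    blocks.append(f"Input: {query}\nLabel:")
    return "\n\n".join(blocks)


def accuracy(
    classify: Callable[[str], str],
    examples: Sequence[Example],
    eval_set: Sequence[Example],
) -> float:
    """Fraction of eval items the classifier labels correctly with these examples."""
    correct = sum(
        classify(build_prompt(examples, query)).strip() == gold
        for query, gold in eval_set
    )
    return correct / len(eval_set)


def leave_one_out(
    classify: Callable[[str], str],
    examples: Sequence[Example],
    eval_set: Sequence[Example],
) -> list[tuple[int, float]]:
    """For each example index, return the accuracy change when it is removed.

    A positive delta means the prompt scores higher WITHOUT that example,
    i.e. the example is a candidate for having hurt performance.
    """
    baseline = accuracy(classify, examples, eval_set)
    deltas = []
    for i in range(len(examples)):
        reduced = [ex for j, ex in enumerate(examples) if j != i]
        deltas.append((i, accuracy(classify, reduced, eval_set) - baseline))
    return deltas


def toy_classify(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call (assumption, not a real API).

    It copies the label of the last few-shot example in the prompt.
    """
    labels = [
        line.removeprefix("Label: ")
        for line in prompt.splitlines()
        if line.startswith("Label: ")
    ]
    return labels[-1] if labels else "unknown"
```

In practice `toy_classify` would be replaced by a function that sends the prompt to a real model, and the evaluation set would need to be large enough that a difference of a few items is not noise. Running the same ablation over several orderings of the examples would also help separate the effect of an example's content from the effect of its position.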

IMPACT Demonstrates that prompt engineering requires careful selection and placement of examples, not just quantity, to avoid performance degradation.

RANK_REASON The cluster describes an experiment with LLMs and few-shot examples, akin to a research paper's findings.

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

dev.to — LLM tag TIER_1 English(EN) · Ken Imoto · 2026-06-11 13:00

I Added 6 Few-Shot Examples to One Prompt. Two of Them Made the Output Worse.

For a long time I treated few-shot examples like seasoning. More is more. If two examples made a prompt better, six would make it great, and I never bothered to check the math on that assumption.

Last month I sat down with one classification prompt and actually measured…
