Adding more few-shot examples to an LLM prompt does not always improve performance and can sometimes degrade it. In one experiment, a prompt with six examples performed worse than one with four: the two additional examples reduced accuracy. The author found that longer, more detailed examples, particularly when placed at the end of the prompt, could skew the model's output through effects such as recency bias and distribution shift.
IMPACT Demonstrates that prompt engineering requires careful selection and placement of examples, not just quantity, to avoid performance degradation.
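One way to check this on your own task is to sweep the number of examples and measure accuracy at each count. The Python sketch below is a minimal illustration, not the author's setup: build_prompt, accuracy, sweep_shot_counts, the "Input:/Output:" layout, and the toy last_label_stub model are all hypothetical names and assumptions introduced here. In practice, model would be replaced by a call to a real LLM.

```python
from typing import Callable, Sequence

Example = tuple[str, str]  # (input text, expected label)


def build_prompt(instruction: str, shots: Sequence[Example], query: str) -> str:
    """Assemble instruction, worked examples, then the unanswered query."""
    parts = [instruction]
    for text, label in shots:
        parts.append(f"Input: {text}\nOutput: {label}")
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)


def accuracy(
    model: Callable[[str], str],
    instruction: str,
    shots: Sequence[Example],
    eval_set: Sequence[Example],
) -> float:
    """Fraction of eval items whose model answer exactly matches the label."""
    correct = sum(
        model(build_prompt(instruction, shots, text)).strip() == label
        for text, label in eval_set
    )
    return correct / len(eval_set)


def sweep_shot_counts(
    model: Callable[[str], str],
    instruction: str,
    pool: Sequence[Example],
    eval_set: Sequence[Example],
) -> dict[int, float]:
    """Accuracy for k = 0..len(pool), using the first k examples in order."""
    return {
        k: accuracy(model, instruction, pool[:k], eval_set)
        for k in range(len(pool) + 1)
    }


def last_label_stub(prompt: str) -> str:
    """Toy stand-in for an LLM (assumption): copies the label of the final
    example, a caricature of recency bias. Answers "pos" with no examples."""
    labels = [
        line[len("Output: "):]
        for line in prompt.splitlines()
        if line.startswith("Output: ")
    ]
    return labels[-1] if labels else "pos"


if __name__ == "__main__":
    pool = [("great film", "pos"), ("dull plot", "neg")]
    eval_set = [("loved it", "pos"), ("a delight", "pos")]
    print(sweep_shot_counts(last_label_stub, "Classify the sentiment.", pool, eval_set))
```

Reordering pool before the sweep, or swapping a long example for a short one, would extend the same harness to test placement and example length rather than count alone.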
RANK_REASON The cluster describes an experiment with LLMs and few-shot examples, akin to a research paper's findings.
AI-generated summary · Google Gemini · from 1 source.