How Few-Shot Examples Add Up: A Causal Decomposition of Function Vectors in In-Context Learning
Researchers have developed a method for understanding how few-shot in-context learning works in large language models. Their results indicate that the model's behavior can be written as a linear combination of the individual examples in the prompt, which suggests that each demonstration contributes additively. The model also reweights these examples adaptively based on context, giving more weight to demonstrations that are more informative or less ambiguous. By separating query-key routing (which examples are attended to) from value updates (what each example contributes), the work offers a mechanistic explanation of how prompts implement tasks.
AI IMPACT: Provides a mechanistic understanding of in-context learning that could guide future model development and prompt engineering.
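
To make the routing/value split concrete, here is a minimal toy sketch of a single attention head. It is not the method from the research summarized above. The sizes, the random vectors, and every variable name are assumptions chosen for illustration. It only shows a general property of attention: once the query-key weights are fixed, the head output is a weighted sum of value vectors, so it splits exactly into one additive term per demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

d = 8                    # hidden size of the toy head (assumed)
n_examples = 3           # number of few-shot demonstrations (assumed)
tokens_per_example = 4   # tokens in each demonstration (assumed)
n_tokens = n_examples * tokens_per_example

# Random stand-ins for the key and value vectors of every demonstration
# token, plus one query vector for the position being predicted.
keys = rng.normal(size=(n_tokens, d))
values = rng.normal(size=(n_tokens, d))
query = rng.normal(size=d)

# Query-key routing: softmax attention weights over all demonstration tokens.
scores = keys @ query / np.sqrt(d)
weights = np.exp(scores - scores.max())
weights /= weights.sum()

# Value update: the head output is the attention-weighted sum of values.
head_output = weights @ values

# Group tokens by the demonstration they belong to.
example_ids = np.repeat(np.arange(n_examples), tokens_per_example)

# Additive contribution of each demonstration to the head output.
contributions = np.stack([
    weights[example_ids == i] @ values[example_ids == i]
    for i in range(n_examples)
])

# Total attention mass routed to each demonstration (its effective weight).
example_mass = np.array([
    weights[example_ids == i].sum() for i in range(n_examples)
])

# The per-demonstration terms sum back to the full head output.
assert np.allclose(contributions.sum(axis=0), head_output)
```

In this sketch, `example_mass` plays the role of the context-dependent reweighting: changing `query` shifts how much attention mass each demonstration receives without changing the value vectors themselves.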