Phase Transitions in Attention: A Bayesian Theory of Copy Head Emergence
Researchers have developed a Bayesian theory to explain the emergence of "copy heads" in transformer attention mechanisms. Their analysis of a single-layer softmax attention network reveals a phase transition in how these attention patterns form, governed by the amount of training data. The framework offers a first-principles explanation for the abrupt appearance of specific subcircuits, similar to what has been observed during large language model training.
AI IMPACT: Provides a theoretical explanation for emergent behaviors in LLMs, potentially guiding future model design and training.
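
To make the term "copy head" concrete, here is a minimal toy sketch in Python of a single softmax attention head whose weights concentrate on one source position, so that its output reproduces the token found there. It is an illustration only, not the model or the Bayesian analysis from the work summarized above. The names attention_weights, head_output, source_pos and the sharpness parameter beta are hypothetical, and beta is set by hand here rather than learned from data.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(num_positions, source_pos, beta):
    # Assumption: the score is beta at the position to copy from, 0 elsewhere.
    # beta is a hypothetical sharpness knob, not a quantity from the source.
    scores = [beta if i == source_pos else 0.0 for i in range(num_positions)]
    return softmax(scores)

def head_output(tokens, vocab, weights):
    # Values are one-hot token embeddings, so the attention-weighted sum
    # is a probability distribution over the vocabulary.
    out = {v: 0.0 for v in vocab}
    for tok, w in zip(tokens, weights):
        out[tok] += w
    return out

vocab = ["a", "b", "c", "d"]
tokens = ["a", "b", "c", "d"]

# beta = 0: attention is uniform, so nothing in particular is copied.
diffuse = attention_weights(len(tokens), source_pos=2, beta=0.0)

# Large beta: attention concentrates on position 2 and the head copies "c".
sharp = attention_weights(len(tokens), source_pos=2, beta=20.0)
copied = head_output(tokens, vocab, sharp)
```

With beta at zero the head spreads its attention evenly across positions. With a large beta nearly all of the weight sits on the chosen position and the output is close to a one-hot vector for that token, which is the attention pattern the term "copy head" refers to. How such a pattern emerges during training as the amount of data grows is the question the theory addresses, and this sketch does not model that.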