Multi-Hop Knowledge Composition is Bound by Pretraining Exposure
A new research paper investigates why large language models struggle with multi-hop reasoning even when they possess the individual facts needed. The study found that models fail to combine separate facts to answer a new question, such as inferring a birthdate from two related pieces of information. The paper attributes this failure to a lack of exposure to compositional contexts during pretraining, rather than to an absence of knowledge.
AI IMPACT: Highlights a fundamental limitation in LLM reasoning, suggesting that improvements require changes to the composition of pretraining data.