PulseAugur

LLM multi-hop reasoning failure linked to pretraining data

A new research paper investigates why large language models fail at implicit multi-hop reasoning even when they have memorized every individual fact involved. For example, a model can correctly answer "When was X born?" and "Who is Y's closest friend?" yet fail on "When was Y's closest friend born?" when it must answer in a single forward pass. The paper attributes this failure to limited exposure to compositional contexts during pretraining rather than to missing knowledge.
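To make the two-hop structure concrete, here is a minimal Python sketch that is not taken from the paper. It composes two single-hop lookups explicitly, which is the step a model would have to perform implicitly within one forward pass. The dictionaries, the function names `hop` and `two_hop_birth_year`, and the year are hypothetical placeholders chosen for illustration.

```python
# Toy illustration only: all names and data here are hypothetical,
# not drawn from the paper or its datasets.

# Two separately stored single-hop facts.
closest_friend = {"Y": "X"}   # "Who is Y's closest friend?" -> X
birth_year = {"X": 1970}      # "When was X born?" -> 1970 (made-up value)


def hop(relation, entity):
    """Answer one single-hop question; return None if the fact is unknown."""
    return relation.get(entity)


def two_hop_birth_year(person):
    """Answer "When was <person>'s closest friend born?" by chaining two hops."""
    friend = hop(closest_friend, person)   # first hop: resolve the bridge entity
    if friend is None:
        return None
    return hop(birth_year, friend)         # second hop: look up the target attribute


print(two_hop_birth_year("Y"))
```

Both single-hop facts are present in this toy store, so the chained lookup succeeds. The summarized finding is that a language model can hold both facts and still fail to chain them without an explicit intermediate step.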

IMPACT Highlights a fundamental limitation in LLM reasoning, suggesting improvements require changes to pretraining data composition.

RANK_REASON Academic paper on LLM reasoning limitations.

AI-generated summary · Google Gemini · from 2 sources.

COVERAGE [2]

  1. arXiv cs.CL TIER_1 English(EN) · Valentin Barrière

    Multi-Hop Knowledge Composition is Bound by Pretraining Exposure

    Large Language Models fail at implicit multi-hop reasoning: a model answers "When was $X$ born?" and "Who is $Y$'s closest friend?" correctly but fails on "When was $Y$'s closest friend born?" in a single forward pass, even when both facts are perfectly memorized and individually…

  2. Hugging Face Daily Papers TIER_1 English(EN)

    Multi-Hop Knowledge Composition is Bound by Pretraining Exposure