
New method LOCOS identifies non-literal retrieval heads in LLMs

Researchers have developed Logit-Contribution Scoring (LOCOS), a method for identifying non-literal retrieval heads in large language models. Unlike previous methods, which focused on literal token matching, LOCOS analyzes the output-value circuit of attention heads to understand how they synthesize information from context. The approach is reported to detect heads responsible for non-literal retrieval more effectively across several model families, including Qwen3, Gemma-3, and OLMo-3.1. Ablating the heads it identifies leads to significant performance drops on tasks that require synthesis.
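To make the idea of scoring a head through its output-value circuit more concrete, here is a minimal toy sketch. It is not the paper's implementation: the exact LOCOS formula is not given in this summary, so the scoring rule below (the logit a head adds to the answer token through attention to a chosen context span) is an assumption, and every name (`matvec`, `head_logit_contribution`, `rank_heads`, the toy weights) is hypothetical. Layer normalization and multi-layer effects are ignored.

```python
# Toy sketch (hypothetical, not the paper's code): score each attention head
# by the logit its output-value (OV) path adds to the answer token when the
# head attends to a relevant context span.

def matvec(vec, mat):
    """Row vector (length n) times matrix (n x m) -> row vector (length m)."""
    return [sum(v * row[j] for v, row in zip(vec, mat)) for j in range(len(mat[0]))]

def head_logit_contribution(attn_row, hidden, w_v, w_o, w_u, span, answer_id):
    """Logit added to `answer_id` by one head via positions in `span`.

    attn_row: attention weights of the final query position over all keys
    hidden:   per-position input vectors to the head (seq_len x d_model)
    w_v, w_o: the head's value (d_model x d_head) and output (d_head x d_model) maps
    w_u:      unembedding matrix (d_model x vocab)
    """
    head_out = [0.0] * len(w_o[0])
    for pos in span:
        value = matvec(hidden[pos], w_v)   # project into the head's value space
        written = matvec(value, w_o)       # project back to the residual stream
        head_out = [h + attn_row[pos] * w for h, w in zip(head_out, written)]
    return matvec(head_out, w_u)[answer_id]

def rank_heads(heads, hidden, w_u, span, answer_id):
    """Return (head_name, score) pairs, highest logit contribution first."""
    scores = {
        name: head_logit_contribution(
            h["attn"], hidden, h["w_v"], h["w_o"], w_u, span, answer_id
        )
        for name, h in heads.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy data: 3 positions, d_model = 2, d_head = 1, vocabulary of 2 tokens.
hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w_u = [[1.0, 0.0], [0.0, 1.0]]
heads = {
    "A": {"attn": [0.1, 0.8, 0.1], "w_v": [[0.0], [1.0]], "w_o": [[0.0, 2.0]]},
    "B": {"attn": [0.8, 0.1, 0.1], "w_v": [[1.0], [0.0]], "w_o": [[1.0, 0.0]]},
}
ranking = rank_heads(heads, hidden, w_u, span=[1], answer_id=1)
```

In this toy setup, head A attends to the span and writes in the direction of the answer token, so it ranks above head B. Zeroing the output of the top-ranked heads and re-measuring task accuracy would be the analogue of the ablation check the summary describes.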

IMPACT Provides a more accurate method for interpreting how LLMs synthesize information, crucial for understanding and improving long-context capabilities.

RANK_REASON Academic paper introducing a new method for analyzing LLM behavior.


AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. arXiv cs.AI · TIER_1 · English (EN) · Pasquale Minervini

    Logit-Contribution Scoring Identifies Non-Literal Retrieval Heads

    In long-context use, large language models frequently synthesize answers from the meaning of a relevant context span rather than literally copy-pasting them. Identifying which attention heads perform this synthesis matters for interpreting long-context model behavior. Yet existin…