Logit-Contribution Scoring Identifies Non-Literal Retrieval Heads

New Method LOCOS Identifies Non-Literal Retrieval Heads in Large Language Models

By the PulseAugur editorial team · [2 sources] · 2026-07-01 14:41

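Researchers have developed a method called Logit-Contribution Scoring (LOCOS) for identifying non-literal retrieval heads in large language models. Unlike earlier methods, which focus on literal token matching, LOCOS analyzes the output-value circuit of each attention head to understand how the head synthesizes information from the context. The method is more effective at detecting the heads responsible for non-literal retrieval across model families including Qwen3, Gemma-3, and OLMo-3.1. Ablating the identified heads causes a significant performance drop on tasks that require synthesis.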
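This summary does not give the scoring formula that LOCOS uses, so the following is only a rough illustration of the general idea of measuring a head's logit contribution through its output-value circuit. It is a minimal NumPy sketch under stated assumptions: a single head, no layer normalization, and hypothetical names (`attn`, `x`, `W_V`, `W_O`, `W_U`, `span`) that do not come from the paper.

```python
import numpy as np

def per_position_logit_contribution(attn, x, W_V, W_O, W_U, target_token):
    """Split one head's direct effect on a target logit by context position.

    attn: (seq,) attention weights of the final query over the context
    x:    (seq, d_model) inputs to the head at each context position
    W_V:  (d_model, d_head), W_O: (d_head, d_model), W_U: (d_model, vocab)
    All names are illustrative assumptions, not the paper's notation.
    """
    # What each position would write to the residual stream via the OV circuit.
    ov_out = x @ W_V @ W_O                           # (seq, d_model)
    # Project onto the unembedding direction of the target token.
    per_token_logit = ov_out @ W_U[:, target_token]  # (seq,)
    # Weight by how strongly the query attends to each position.
    return attn * per_token_logit                    # (seq,)

def head_logit_contribution(attn, x, W_V, W_O, W_U, target_token, span):
    """Sum the per-position contributions over a context span [start, end)."""
    start, end = span
    contrib = per_position_logit_contribution(attn, x, W_V, W_O, W_U, target_token)
    return float(contrib[start:end].sum())

# Toy usage with random weights (no real model involved).
rng = np.random.default_rng(0)
seq, d_model, d_head, vocab = 6, 8, 4, 10
attn = rng.random(seq)
attn /= attn.sum()
x = rng.normal(size=(seq, d_model))
W_V = rng.normal(size=(d_model, d_head))
W_O = rng.normal(size=(d_head, d_model))
W_U = rng.normal(size=(d_model, vocab))
score = head_logit_contribution(attn, x, W_V, W_O, W_U, target_token=3, span=(2, 5))
print(score)
```

In this sketch a head scores highly when the positions it attends to inside the relevant span push the answer token's logit upward, whether or not the answer token literally appears in that span. That is the contrast the summary draws with literal token matching.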

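Impact: Provides a more accurate way to interpret how large language models synthesize information, which is essential for understanding and improving long-context capabilities.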

Ranking rationale: An academic paper introducing a new method for analyzing large language model behavior.


AI-generated summary · Google Gemini · based on 2 sources.

Sources [2]

arXiv cs.AI · Aryo Pradipta Gema, Beatrice Alex, Pasquale Minervini · 2026-07-02 04:00

Logit-Contribution Scoring Identifies Non-Literal Retrieval Heads

arXiv:2607.01002v1 Announce Type: cross Abstract: In long-context use, large language models frequently synthesize answers from the meaning of a relevant context span rather than literally copy-pasting them. Identifying which attention heads perform this synthesis matters for int…

arXiv cs.AI · Pasquale Minervini · 2026-07-01 14:41

Logit-Contribution Scoring Identifies Non-Literal Retrieval Heads

In long-context use, large language models frequently synthesize answers from the meaning of a relevant context span rather than literally copy-pasting them. Identifying which attention heads perform this synthesis matters for interpreting long-context model behavior. Yet existin…
