PulseAugur
English(EN) When Attention Closes: How LLMs Lose the Thread in Multi-Turn Interaction

New research finds that large language models lose the thread of a conversation as attention closes

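A new research paper introduces a "channel-transition" framework to explain why large language models struggle to maintain context and instructions over extended multi-turn conversations. The study proposes the Goal Accessibility Ratio (GAR) as a metric for quantifying how attention to key instructions degrades. The researchers found that although attention to an instruction may close, the relevant information can persist in the residual representations, which leads to different failure modes across model architectures.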

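Impact: Identifies a core limitation in the conversational abilities of large language models, which may guide future architectural improvements for better long-term memory.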

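Ranking rationale: This cluster contains an academic paper detailing a new mechanistic explanation for why large language models fail in multi-turn conversations.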

Read on arXiv cs.CL →

AI-generated summary · Google Gemini · from 3 sources.

Sources [3]

  1. arXiv cs.CL TIER_1 English(EN) · Dilek Hakkani-Tür ·

    When Attention Closes: How LLMs Lose the Thread in Multi-Turn Interaction

    Large language models can follow complex instructions in a single turn, yet over long multi-turn interactions they often lose the thread of instructions, persona, and rules. This degradation has been measured behaviorally but not mechanistically explained. We propose a channel-tr…

  2. Mastodon — mastodon.social TIER_1 English(EN) · aihaberleri ·

    📰 Attention Closure in LLMs: Why Multi-Turn Conversations Lose the Thread (2026 Study) New research reveals that large language models lose track of instructions in long conversations due to a measurable 'attention closure' mechanism. The study introduces the Goal Accessibility R…

  3. Mastodon — mastodon.social TIER_1 Türkçe(TR) · aihaberleri ·

    📰 Distraction in Multi-Turn Conversations: Why Is AI Losing the Thread in 2026? Why do AI models lose the thread in long conversations? Mistral AI's new research lays out the limits of the attention mechanism and proposed solutions.... # BilimveAraştırma # AI # Te…