
New Research Tackles Long-Context and Reliability Problems in LLM Memory

By the PulseAugur editorial team · [13 sources] · 2026-05-20 09:11

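Several research papers explore new ways to strengthen the memory systems of large language models (LLMs) so that they can handle long contexts and behave more reliably. The approaches include writing context into memory with test-time gradient descent, distilling context into modular adapters, and building full memory-base management systems. Researchers are also working on debugging and attributing errors in these memory systems, proposing benchmarks and frameworks that identify failure modes and improve performance.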

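Impact: These advances aim to improve LLMs' ability to handle long-term interactions and complex reasoning, which could lead to more capable and more reliable AI applications.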

Ranking rationale: Multiple arXiv papers introduce new methods and benchmarks for LLM memory systems.


AI-generated summary · Google Gemini · from 13 sources.

Sources [13]

arXiv cs.CL TIER_1 English(EN) · Yuri Kuratov, Matvey Kairov, Aydar Bulatov, Ivan Rodkin, Mikhail Burtsev · 2026-06-01 04:00

GradMem: Learning to Write Context into Memory with Test-Time Gradient Descent

arXiv:2603.13875v2 Announce Type: replace Abstract: Many large language model applications require conditioning on long contexts. Transformers typically support this by storing a large per-layer KV-cache of past activations, which incurs substantial memory overhead. A desirable a…
arXiv cs.AI TIER_1 English(EN) · Ziyang Zheng, Zeju Li, Xiangyu Wen, Jianyuan Zhong, Junhua Huang, Lei Chen, Mingxuan Yuan, Qiang Xu · 2026-05-29 04:00

Context Distillation as Latent Memory Management

arXiv:2605.28889v1 Announce Type: cross Abstract: Context distillation compresses contextual information into model parameters, yet existing methods often ignore how multiple distilled latent memories should be stored, retrieved, and safely activated in non-oracle settings. We fo…
arXiv cs.AI TIER_1 English(EN) · Jiajie Fu, Junwen Chen, Mengzhao Wang, Aoxiang He, Maojia Sheng, Xiangyu Ke, Yifan Zhu, Yunjun Gao · 2026-05-29 04:00

VikingMem: A Memory Base Management System for Stateful LLM Applications

arXiv:2605.29640v1 Announce Type: new Abstract: Large Language Models have revolutionized interactive applications; however, their finite context windows pose a critical data management challenge for maintaining stateful, long-term interactions. Existing memory approaches often r…
arXiv cs.AI TIER_1 English(EN) · Xinle Deng, Ruobin Zhong, Hujin Peng, Xiaoben Lu, Yanzhe Wu, Guang Li, Buqiang Xu, Yunzhi Yao, Jizhan Fang, Haoliang Cao, Junjie Guo, Yuan Yuan, Ziqing Ma, Yuanqiang Yu, Rui Hu, Baohua Dong, Hangcheng Zhu, Ningyu Zhang · 2026-05-28 04:00

MemTrace: Tracing and Attributing Errors in Large Language Model Memory Systems

arXiv:2605.28732v1 Announce Type: cross Abstract: Memory is essential for enabling large language models to support long-horizon reasoning, yet existing memory systems remain unreliable and difficult to debug. Tracing memory's dynamic evolution is crucial to understand how inform…
arXiv cs.AI TIER_1 English(EN) · Jeffrey Flynt · 2026-05-28 04:00

Structured Belief State and the First Precision-Aware Benchmark for LLM Memory Retrieval

arXiv:2605.11325v2 Announce Type: replace-cross Abstract: Every major benchmark for LLM memory systems, LoCoMo foremost, measures whether a model answered correctly, not whether the memory system retrieved correctly. A system returning its entire belief store achieves recall of 1…
arXiv cs.AI TIER_1 English(EN) · Ningyu Zhang · 2026-05-27 16:53

MemTrace: Tracing and Attributing Errors in Large Language Model Memory Systems

Memory is essential for enabling large language models to support long-horizon reasoning, yet existing memory systems remain unreliable and difficult to debug. Tracing memory's dynamic evolution is crucial to understand how information is synthesized, propagated, or corrupted ove…
arXiv cs.AI TIER_1 English(EN) · Ishir Garg, Neel Kolhe, Dawn Song, Xuandong Zhao · 2026-05-27 04:00

MemFail: Stress-Testing the Failure Modes of LLM Memory Systems

arXiv:2605.26667v1 Announce Type: new Abstract: Large language model (LLM) agents increasingly rely on external memory systems to remain consistent across long-horizon interactions, but little empirical work has been done to understand the specific failure modes and design choice…
Hugging Face Daily Papers TIER_1 English(EN) · 2026-05-27 00:00

MemTrace: Tracing and Attributing Errors in Large Language Model Memory Systems

Memory systems in large language models suffer from reliability issues that can be addressed through a novel tracing framework and automated fault attribution for improved performance.
arXiv cs.CL TIER_1 English(EN) · Jiangnan Yu, Kisson Songqi Lin, Jilong Wu · 2026-05-26 04:00

WhenLoss: Diagnosing Write and Retrieval Bottlenecks in Long-Context Memory Systems

arXiv:2605.24579v1 Announce Type: new Abstract: Long-context memory systems often fail under fixed budgets, but end-to-end evaluation does not reveal whether evidence was discarded during compression or preserved but never retrieved. We introduce a four-condition diagnostic proto…
arXiv cs.AI TIER_1 English(EN) · Ryan Wei Heng Quek, Sanghyuk Lee, Alfred Wei Lun Leong, Arun Verma, Alok Prakash, Nancy F. Chen, Bryan Kian Hsiang Low, Daniela Rus, Armando Solar-Lezama · 2026-05-22 04:00

MeMo: Memory as a Model

arXiv:2605.15156v2 Announce Type: replace-cross Abstract: Large language models (LLMs) achieve strong performance across a wide range of tasks, but remain frozen after pretraining until subsequent updates. Many real-world applications require timely, domain-specific information, …
arXiv cs.IR (Information Retrieval) TIER_1 English(EN) · Rabab Abdelfattah · 2026-05-22 02:53

Same Ranking, Different Winners: How the Scoring Target Affects LLM Memory Benchmarks

Conversational-memory systems increasingly transform dialogue history into facts, summaries, timelines, and other source-linked descendants, so a single source turn can coexist with several derived memories in the same retrieval index. This raises an underspecified evaluation que…
arXiv cs.IR (Information Retrieval) TIER_1 English(EN) · Zhiyu Li · 2026-05-20 09:11

MemConflict: Evaluating Long-Term Memory Systems Under Memory Conflicts

Long-term memory systems enable conversational agents based on large language models (LLMs) to retain, retrieve, and apply user-specific information across multi-session interactions. However, existing evaluations mainly assess outcome-level performance or temporal updating, prov…
MarkTechPost TIER_1 English(EN) · Asif Razzaq · 2026-05-27 05:24

MEMO: A Modular Framework for Training a Dedicated Memory Model on New Knowledge Without Modifying LLM Parameters

Researchers from NUS, MIT, and A*STAR propose MEMO, a modular framework that encodes corpus knowledge into a separate trainable MEMORY model.
