MemoryDocDataSet: A Benchmark for Joint Conversational Memory and Long Document Reasoning

New Benchmarks and Methods Tackle Long-Context and Memory Challenges in Large Language Models

By the PulseAugur editorial team · [9 sources] · 2026-06-02 04:00

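Researchers are developing new methods to improve how large language models handle long conversation histories and complex documents. Several papers introduce new architectures and benchmarks aimed at overcoming the limits of finite context windows. These approaches focus on efficient memory retrieval, summarization, and joint reasoning across conversations and external documents, with the goal of improving model performance in extended interactions.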

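Impact: These advances aim to significantly improve the capabilities of large language models in extended conversations and complex document analysis, enabling more sophisticated AI applications.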

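Ranking rationale: Multiple academic papers introduce new methods and benchmarks for handling long context and memory in large language models.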


AI-generated summary · Google Gemini · from 9 sources.

Sources [9]

arXiv cs.CL TIER_1 English(EN) · Rahul Subramani · 2026-06-05 04:00

LANTERN: Layered Archival and Temporal Episodic Retrieval Network for Long-Context LLM Conversations

arXiv:2606.05182v1 Announce Type: new Abstract: Large language models discard critical details when conversation history is compacted to fit within finite context windows. We present LANTERN (Layered Archival aNd Temporal Episodic Retrieval Network), a lightweight memory layer th…
arXiv cs.CL TIER_1 English(EN) · Aly Lidayan, Jakob Bjorner, Satvik Golechha, Kartik Goyal, Alane Suhr · 2026-06-05 04:00

ABBEL: Learning Natural-Language Belief States for Memory-Efficient Interaction

arXiv:2512.20111v2 Announce Type: replace Abstract: As the time horizons of sequential decision-making tasks grow, keeping full interaction histories in model context becomes increasingly costly. Recent work reduces context lengths by instead conditioning decision-making agents o…
arXiv cs.AI TIER_1 English(EN) · Qiyang Xie, Jialun Wu, Xinjie He, Su Liu, Shuai Xiao, Zhiyuan Lin, Weikai Zhou · 2026-06-04 04:00

MemoryDocDataSet: A Benchmark for Joint Conversational Memory and Long Document Reasoning

arXiv:2606.04442v1 Announce Type: cross Abstract: AI systems increasingly need to combine two demanding capabilities: navigating multi-session conversation history and performing deep reading comprehension within long documents. Yet no existing benchmark evaluates both simultaneo…
arXiv cs.CL TIER_1 English(EN) · Hanbo Bi, Zhiqiang Yuan, Chongyang Li, Qiwei Yan, Zexi Jia, Jiapei Zhang, Xiaoyue Duan, Yingchao Feng, Jinchao Zhang, Jie Zhou · 2026-06-04 04:00

Fine-Grained Fragment Retrieval in Multi-Modal Long-Form Dialogues

arXiv:2606.04591v1 Announce Type: new Abstract: With the widespread adoption of multi-modal communication platforms, long-form dialogues interleaving text and images have become increasingly common. Users often need to retrieve coherent dialogue fragments related to specific topi…
arXiv cs.CL TIER_1 English(EN) · Christian Lysenstøen · 2026-06-04 04:00

Training-Free Lexical-Dense Fusion for Conversational Memory Retrieval

arXiv:2606.04194v1 Announce Type: cross Abstract: Retrieving the few past turns that answer a new query across long multi-session histories is the retrieval bottleneck behind long-term conversational memory (LoCoMo, LongMemEval). Recent concurrent work, Nano-Memory, shows that sc…
arXiv cs.CL TIER_1 English(EN) · Jie Zhou · 2026-06-03 08:29

Fine-Grained Fragment Retrieval in Multi-Modal Long-Form Dialogues

With the widespread adoption of multi-modal communication platforms, long-form dialogues interleaving text and images have become increasingly common. Users often need to retrieve coherent dialogue fragments related to specific topics, rather than isolated utterances. We propose …
Hugging Face Daily Papers TIER_1 English(EN) · 2026-06-03 04:44

MemoryDocDataSet: A Benchmark for Joint Conversational Memory and Long Document Reasoning

AI systems increasingly need to combine two demanding capabilities: navigating multi-session conversation history and performing deep reading comprehension within long documents. Yet no existing benchmark evaluates both simultaneously. We introduce MemoryDocDataSet, a synthetic b…
arXiv cs.IR (Information Retrieval) TIER_1 English(EN) · Christian Lysenstøen · 2026-06-02 20:22

Training-Free Lexical-Dense Fusion for Conversational Memory Retrieval

Retrieving the few past turns that answer a new query across long multi-session histories is the retrieval bottleneck behind long-term conversational memory (LoCoMo, LongMemEval). Recent concurrent work, Nano-Memory, shows that scoring a session by the maximum query-turn similari…
arXiv cs.AI TIER_1 English(EN) · Jingjie Lin, Bingbing Wang, Zihan Wang, Zhengda Jin, Weiming Qiao, Jing Li, Ruifeng Xu · 2026-06-02 04:00

Connecting the Dots: A Benchmark for Reflective Memory in Long-Horizon Conversations

arXiv:2606.01223v1 Announce Type: cross Abstract: Despite substantial progress in long-context modeling, existing benchmarks remain confined to factual memory for explicit recall, failing to measure the reflective memory required to synthesize fragmented, multimodal cues into hig…
