New benchmarks and methods tackle AI agent memory limitations

Qwen tech blog TIER_1 English(EN) · QwenTeam · 2026-04-15 02:00

Qwen3.6-35B-A3B: Agentic Coding Power, Now Open to All

Following the launch of Qwen3.6-Plus, we are excited to open-source Qwen3.6-35B-A3B — a sparse yet remarkably capable mixture-of-experts (MoE) model with 35 billion total parameters and only 3 billion active parameters. Despite its efficiency, Qwen3.6-35B-A3B delivers outstanding…

arXiv cs.CL TIER_1 English(EN) · Jingyi Peng, Zhongwei Wan, Weiting Liu, Qiuzhuang Sun · 2026-05-25 04:00

PRISM: Pareto-Efficient Retrieval over Intent-Aware Structured Memory for Long-Horizon Agents

arXiv:2605.12260v2 Announce Type: replace Abstract: Long-horizon language agents accumulate conversation history far faster than any fixed context window can hold, making memory management critical to both answer accuracy and serving cost. Existing approaches either expand the co…

arXiv cs.CL TIER_1 English(EN) · Alina Shutova, Alexandra Olenina, Ivan Vinogradov, Anton Sinitsin · 2026-05-25 04:00

Evaluating Memory Structure in LLM Agents

arXiv:2602.11243v2 Announce Type: replace-cross Abstract: Modern LLM-based agents and chat assistants rely on long-term memory frameworks to store reusable knowledge, recall user preferences, and augment reasoning. As researchers create more complex memory architectures, it becom…

arXiv cs.CL TIER_1 English(EN) · Jingru Lin, Chen Zhang, Stephen Y. Liu, Haizhou Li · 2026-05-22 04:00

RAGCap-Bench: Benchmarking Capabilities of LLMs in Agentic Retrieval Augmented Generation Systems

arXiv:2510.13910v2 Announce Type: replace Abstract: Retrieval-Augmented Generation (RAG) mitigates key limitations of Large Language Models (LLMs), such as factual errors, outdated knowledge, and hallucinations, by dynamically retrieving external information. Recent work extends th…

arXiv cs.LG TIER_1 English(EN) · Dianzhi Yu, Vireo Zhang, Hongru Wang, Yanyu Chen, Minda Hu, Wanghan Xu, Siki Chen, Philip Torr, Zhenfei Yin, Irwin King · 2026-05-22 04:00

Dynamic Mixture of Latent Memories for Self-Evolving Agents

arXiv:2605.21951v1 Announce Type: new Abstract: Achieving self-evolution in intelligent agents requires the continual accumulation of new knowledge across changing task sequences without forgetting previously acquired abilities. Existing approaches either internalize knowledge by…

arXiv cs.LG TIER_1 English(EN) · Sikuan Yan, Ahmed Bahloul, Ercong Nie, Susanna Schwarzmann, Riccardo Trivisonno, Volker Tresp, Yunpu Ma · 2026-05-22 04:00

Memory-R2: Fair Credit Assignment for Long-Horizon Memory-Augmented LLM Agents

arXiv:2605.21768v1 Announce Type: new Abstract: Memory-augmented LLM agents enable interactions that extend beyond finite context windows by storing, updating, and reusing information across sessions. However, training such agents with reinforcement learning in multi-session envi…

arXiv cs.CL TIER_1 English(EN) · Weiwei Xie, Shaoxiong Guo, Fan Zhang, Tian Xia, Xue Yang, Lizhuang Ma, Junchi Yan, Qibing Ren · 2026-05-22 04:00

MemEvoBench: Benchmarking Safety Risks from Memory Misevolution in LLM Agents

arXiv:2604.15774v2 Announce Type: replace Abstract: Equipping Large Language Models (LLMs) with persistent memory enhances interaction continuity and personalization but introduces new safety risks. Specifically, contaminated or biased memory accumulation can trigger abnormal age…

arXiv cs.AI TIER_1 English(EN) · Jiawei He, Jie Jia, Chenbo Liu, Chaoyi Xue, Yapeng Song, Xikai Yang, Dong Sun · 2026-05-22 04:00

ProcBench: Evaluating Process-Level Defects and Control Preservation in LLM Coding Agents

arXiv:2605.20251v2 Announce Type: cross Abstract: Existing benchmarks for LLM coding agents primarily evaluate final outcomes. While useful for measuring overall capability, these metrics provide limited visibility and often miss defects that arise during execution. We present Pr…

arXiv cs.AI TIER_1 English(EN) · Haozhen Zhang, Haodong Yue, Tao Feng, Quanyu Long, Jianzhu Bao, Bowen Jin, Weizhi Zhang, Xiao Li, Jiaxuan You, Chengwei Qin, Wenya Wang · 2026-05-22 04:00

Learning Query-Aware Budget-Tier Routing for Runtime Agent Memory

arXiv:2602.06025v2 Announce Type: replace-cross Abstract: Memory is increasingly central to Large Language Model (LLM) agents operating beyond a single context window, yet most existing systems rely on offline, query-agnostic memory construction that can be inefficient and may di…

arXiv cs.AI TIER_1 English(EN) · Dongming Jiang, Yi Li, Songtao Wei, Jinxin Yang, Ayushi Kishore, Alysa Zhao, Dingyi Kang, Xu Hu, Feng Chen, Qiannan Li, Bingzhe Li · 2026-05-22 04:00

Anatomy of Agentic Memory: Taxonomy and Empirical Analysis of Evaluation and System Limitations

arXiv:2602.19320v2 Announce Type: replace-cross Abstract: Agentic memory systems enable large language model (LLM) agents to maintain state across long interactions, supporting long-horizon reasoning and personalization beyond fixed context windows. Despite rapid architectural de…

arXiv cs.MA (Multiagent) TIER_1 English(EN) · Zhuokai Zhao · 2026-05-21 16:55

Self-Evolving Multi-Agent Systems via Decentralized Memory

Self-evolving multi-agent systems (MAS) have emerged as a promising route to LLM agents that continually improve from experience, with persistent memory at their foundation. However, existing designs almost exclusively adopt a centralized repository shared across agents, incurrin…

arXiv cs.MA (Multiagent) TIER_1 English(EN) · Yunpu Ma · 2026-05-20 22:02

Memory-R2: Fair Credit Assignment for Long-Horizon Memory-Augmented LLM Agents

Memory-augmented LLM agents enable interactions that extend beyond finite context windows by storing, updating, and reusing information across sessions. However, training such agents with reinforcement learning in multi-session environments is challenging because memory turns the…

arXiv cs.CL TIER_1 English(EN) · Dimitris N. Metaxas · 2026-05-20 07:25

MemGym: a Long-Horizon Memory Environment for LLM Agents

Memory is a central capability for LLM agents operating across long-horizon tasks. Existing memory benchmarks predominantly evaluate retention of personalized information in multi-turn chat scenarios, overlooking the dynamic memory formation that occurs during extended agent exec…

arXiv cs.CL TIER_1 English(EN) · Jiaxuan You · 2026-05-20 02:03

Auto-Dreamer: Learning Offline Memory Consolidation for Language Agents

Language agents increasingly operate over streams of related tasks, yet existing memory systems struggle to convert accumulated experience into reusable knowledge. Retrieval-augmented and structured memory methods record per-session observations effectively, but often couple acqu…

arXiv cs.CL TIER_1 English(EN) · Bo Han · 2026-05-19 15:05

Rethinking How to Remember: Beyond Atomic Facts in Lifelong LLM Agent Memory

To enable reliable long-term interaction, LLM agents require a memory system that can faithfully store, efficiently retrieve, and deeply reason over accumulated dialogue history. Most existing methods adopt an extracted-fact-based paradigm: handcrafted static prompts compress raw…

arXiv cs.AI TIER_1 English(EN) · Samuel Madden · 2026-05-19 14:51

PEEK: Context Map as an Orientation Cache for Long-Context LLM Agents

Large language model (LLM) agents increasingly operate over long and recurring external contexts, like document corpora and code repositories. Across invocations, existing approaches preserve either the agent's trajectory, passive access to raw material, or task-level strategies.…

arXiv cs.CL TIER_1 English(EN) · Rui Chu · 2026-05-18 23:47

MMoA: An AI-Agent framework with recurrence for Memoried Mixture-of-Agent

The Mixture-of-Agents (MoA) framework has shown promise in improving large language model (LLM) performance by aggregating outputs from multiple agents. However, existing MoA systems often rely on static routers that do not fully capture temporal and contextual dependencies acros…

arXiv cs.AI TIER_1 English(EN) · Mohit Bansal · 2026-05-18 15:43

LongMINT: Evaluating Memory under Multi-Target Interference in Long-Horizon Agent Systems

Real-world agents operate over long and evolving horizons, where information is repeatedly updated and may interfere across memories, requiring accurate recall and aggregated reasoning over multiple pieces of information. However, existing benchmarks focus on static, independent …

arXiv cs.AI TIER_1 English(EN) · Jia Li · 2026-05-18 13:54

EvoMemBench: Benchmarking Agent Memory from a Self-Evolving Perspective

Recent benchmarks for Large Language Model (LLM) agents mainly evaluate reasoning, planning, and execution. However, memory is also essential for agents, as it enables them to store, update, and retrieve information over time. This ability remains under-evaluated, largely because…

arXiv cs.CL TIER_1 English(EN) · Ming Jin · 2026-05-18 04:06

Remembering More, Risking More: Longitudinal Safety Risks in Memory-Equipped LLM Agents

Safety evaluations of memory-equipped LLM agents typically measure within-task safety: whether an agent completes a single scenario safely, often under adversarial conditions such as prompt injection or memory poisoning. In deployment, however, a single agent serves many independ…

arXiv cs.CL TIER_1 English(EN) · Olukunle Owolabi · 2026-05-18 03:11

SocialMemBench: Are AI Memory Systems Ready for Social Group Settings?

Memory systems for AI assistants were built for single-user dialogue and fail characteristically when applied to multi-party social group settings. This gap matters for the social assistants being built today: group-acting agents embedded in chat platforms, and proactive personal…

arXiv cs.CL TIER_1 English(EN) · Marzia Zaman · 2026-05-15 17:42

FORGE: Self-Evolving Agent Memory With No Weight Updates via Population Broadcast

Can LLM agents improve decision-making through self-generated memory without gradient updates? We propose FORGE (Failure-Optimized Reflective Graduation and Evolution), a staged, population-based protocol that evolves prompt-injected natural-language memory for hierarchical ReAct…

arXiv cs.CL TIER_1 English(EN) · James Cheng · 2026-05-15 15:17

RecMem: Recurrence-based Memory Consolidation for Efficient and Effective Long-Running LLM Agents

Memory systems often organize user-agent interactions as retrievable external memory and are crucial for long-running agents by overcoming the limited context windows of LLMs. However, existing memory systems invoke LLMs to process every incoming interaction for memory extraction…

arXiv cs.CL TIER_1 English(EN) · Yu Zhang · 2026-05-15 09:20

DimMem: Dimensional Structuring for Efficient Long-Term Agent Memory

Large language model (LLM) agents require long-term memory to leverage information from past interactions. However, existing memory systems often face a fidelity-efficiency trade-off: raw dialogue histories are expensive, while flat facts or summaries may discard the structure n…

arXiv cs.CL TIER_1 English(EN) · Weinan Zhang · 2026-05-15 08:00

SMMBench: A Benchmark for Source-Distributed Multimodal Agent Memory

Existing benchmarks for multimodal memory reasoning largely evaluate systems within pre-assembled contexts, but under-evaluate whether agents can use evidence distributed across independently originated sources. We argue that source-distributed memory composition is an important …

arXiv cs.CL TIER_1 English(EN) · Yuchi Ma · 2026-05-15 07:46

H-Mem: A Novel Memory Mechanism for Evolving and Retrieving Agent Memory via a Hybrid Structure

Memory data are ubiquitous in Large Language Model (LLM)-based agents (e.g., OpenClaw and Manus). A few recent works have attempted to exploit agents' memory for improving their performance on the question-answering (QA) task, but they lack a principled mechanism for effectively m…

arXiv cs.AI TIER_1 English(EN) · Armando Solar-Lezama · 2026-05-14 17:51

MeMo: Memory as a Model

Large language models (LLMs) achieve strong performance across a wide range of tasks, but remain frozen after pretraining until subsequent updates. Many real-world applications require timely, domain-specific information, motivating the need for efficient mechanisms to incorporat…

arXiv cs.AI TIER_1 English(EN) · Jorge Alberto Hidalgo Toledo · 2026-05-14 16:29

AI Knows When It's Being Watched: Functional Strategic Action and Contextual Register Modulation in Large Language Models

Large language models (LLMs) have been extensively studied from computational and cognitive perspectives, yet their behavior as communicative actors in socially structured contexts remains underexplored. This study examines whether LLM-based multi-agent systems exhibit systematic…

arXiv cs.CL TIER_1 English(EN) · Evgeniy Gabrilovich · 2026-05-14 07:38

GroupMemBench: Benchmarking LLM Agent Memory in Multi-Party Conversations

Large Language Model (LLM) agents increasingly serve as personal assistants and workplace collaborators, where their utility depends on memory systems that extract, retrieve, and apply information across long-running conversations. However, both existing memory systems and benchm…

arXiv cs.CL TIER_1 English(EN) · Hong Yan · 2026-05-14 05:38

Agentic Recommender System with Hierarchical Belief-State Memory

Memory-augmented LLM agents have advanced personalized recommendation, yet existing approaches universally adopt flat memory representations that conflate ephemeral signals with stable preferences, and none provides a complete lifecycle governing how memory should evolve. We prop…

arXiv cs.CL TIER_1 English(EN) · Kai-Wei Chang · 2026-05-12 17:59

LongMemEval-V2: Evaluating Long-Term Agent Memory Toward Experienced Colleagues

Long-term memory is crucial for agents in specialized web environments, where success depends on recalling interface affordances, state dynamics, workflows, and recurring failure modes. However, existing memory benchmarks for agents mostly focus on user histories, short traces, o…

arXiv cs.AI TIER_1 English(EN) · William Parris · 2026-05-12 17:03

Semantic Reward Collapse and the Preservation of Epistemic Integrity in Adaptive AI Systems

Recent advances in reinforcement learning from human feedback (RLHF) and preference optimization have substantially improved the usability, coherence, and safety of large language models. However, recurring behaviors such as performative certainty, hallucinated continuity, calibr…

Hugging Face Daily Papers TIER_1 English(EN) · 2026-05-12 15:48

Executable Agentic Memory for GUI Agent

Modern GUI agents typically rely on a model-centric and step-wise interaction paradigm, where LLMs must re-interpret the UI and re-decide actions at every screen, which is fragile in long-horizon tasks. In this paper, we propose Executable Agentic Memory (EAM), a structured Knowl…

arXiv cs.CL TIER_1 English(EN) · Qiuzhuang Sun · 2026-05-12 15:28

PRISM: Pareto-Efficient Retrieval over Intent-Aware Structured Memory for Long-Horizon Agents

Long-horizon language agents accumulate conversation history far faster than any fixed context window can hold, making memory management critical to both answer accuracy and serving cost. Existing approaches either expand the context window without addressing what is retrieved, p…

arXiv cs.AI TIER_1 English(EN) · Scott Sanner · 2026-05-12 14:51

Goal-Oriented Reasoning for RAG-based Memory in Conversational Agentic LLM Systems

LLM-based conversational AI agents struggle to maintain coherent behavior over long horizons due to limited context. While RAG-based approaches are increasingly adopted to overcome this limitation by storing interactions in external memory modules and performing retrieval from th…

arXiv cs.AI TIER_1 English(EN) · Zenglin Xu · 2026-05-11 17:20

Remember the Decision, Not the Description: A Rate-Distortion Framework for Agent Memory

Long-horizon language agents must operate under limited runtime memory, yet existing memory mechanisms often organize experience around descriptive criteria such as relevance, salience, or summary quality. For an agent, however, memory is valuable not because it faithfully descri…

arXiv cs.AI TIER_1 English(EN) · Jimmy Lin · 2026-05-11 16:58

Rethinking Agentic Search with Pi-Serini: Is Lexical Retrieval Sufficient?

Does a lexical retriever suffice as large language models (LLMs) become more capable in an agentic loop? This question naturally arises when building deep research systems. We revisit it by pairing BM25 with frontier LLMs that have better reasoning and tool-use abilities. To supp…

arXiv cs.AI TIER_1 English(EN) · Min Zhang · 2026-05-11 09:30

MemReread: Enhancing Agentic Long-Context Reasoning via Memory-Guided Rereading

To tackle long-context reasoning tasks without the quadratic complexity of standard attention mechanisms, approaches based on agent memory have emerged, which typically maintain a dynamically updated memory when linearly processing document chunks. To mitigate the potential loss …

arXiv cs.AI TIER_1 English(EN) · Tony Q. S. Quek · 2026-05-11 06:04

Bridging the Cognitive Gap: A Unified Memory Paradigm for 6G Agentic AI-RAN

As 6G evolves, the radio access network must transcend traditional automation to embrace agentic AI capable of perception, reasoning, and evolution. A fundamental cognitive gap persists in current disaggregated architectures, where interfaces force the physical layer to compress …

arXiv cs.CL TIER_1 English(EN) · Jianfei Yang · 2026-05-08 09:41

InterLV-Search: Benchmarking Interleaved Multimodal Agentic Search

Existing benchmarks for multimodal agentic search evaluate multimodal search and visual browsing, but visual evidence is either confined to the input or treated as an answer endpoint rather than part of an interleaved search trajectory. We introduce InterLV-Search, a ben…

arXiv cs.CL TIER_1 English(EN) · Junfeng Liao, Qizhou Wang, Jianing Zhu, Bo Du, Rui Yan, Xiuying Chen · 2026-05-08 04:00

Belief Memory: Agent Memory Under Partial Observability

arXiv:2605.05583v1 Announce Type: cross Abstract: LLM agents that operate over long context depend on external memory to accumulate knowledge over time. However, existing methods typically store each observation as a single deterministic conclusion (e.g., inferring "API X failed"…

arXiv cs.AI TIER_1 English(EN) · Spyros Galanis · 2026-05-08 04:00

Information Aggregation with AI Agents

arXiv:2604.20050v2 Announce Type: replace-cross Abstract: Can Large Language Models (AI agents) aggregate dispersed private information through trading and reason about the knowledge of others by observing price movements? We conduct a controlled experiment where AI agents trade …

arXiv cs.AI TIER_1 English(EN) · Yuxiang Zhang, Jiangming Shu, Ye Ma, Xueyuan Lin, Shangxi Wu, Jitao Sang · 2026-05-08 04:00

Memory as Action: Autonomous Context Curation for Long-Horizon Agentic Tasks

arXiv:2510.12635v3 Announce Type: replace Abstract: Long-context Large Language Models, despite their expanded capacity, require careful working memory management to mitigate attention dilution during long-horizon tasks. Yet existing approaches rely on external mechanisms that la…

arXiv cs.AI TIER_1 English(EN) · Zhuofeng Li, Haoxiang Zhang, Cong Wei, Pan Lu, Ping Nie, Yi Lu, Yuyang Bai, Shangbin Feng, Hangxiao Zhu, Ming Zhong, Yuyu Zhang, Jianwen Xie, Yejin Choi, James Zou, Jiawei Han, Wenhu Chen, Jimmy Lin, Dongfu Jiang, Yu Zhang · 2026-05-08 04:00

Beyond Semantic Similarity: Rethinking Retrieval for Agentic Search via Direct Corpus Interaction

arXiv:2605.05242v1 Announce Type: cross Abstract: Modern retrieval systems, whether lexical or semantic, expose a corpus through a fixed similarity interface that compresses access into a single top-k retrieval step before reasoning. This abstraction is efficient, but for agentic…

arXiv cs.AI TIER_1 English(EN) · Huyu Wu, Jun Liu, Xiaochi Wei, Yan Gao, Yi Wu, Yao Hu · 2026-05-08 04:00

Knowledge-Graph Paths as Intermediate Supervision for Self-Evolving Search Agents

arXiv:2605.05702v1 Announce Type: new Abstract: Self-evolving search agents reduce reliance on human-written training questions by generating and solving their own search tasks. We build on Search Self-Play (SSP), a representative Proposer and Solver framework in which questions …

arXiv cs.AI TIER_1 English(EN) · Susheel Suresh, Hazel Mak, Shangpo Chou, Fred Kroon, Sahil Bhatnagar · 2026-05-08 04:00

AgenticRAG: Agentic Retrieval for Enterprise Knowledge Bases

arXiv:2605.05538v1 Announce Type: new Abstract: We present AgenticRAG, a practical agentic harness for retrieval and analysis over enterprise knowledge bases. Standard RAG pipelines place significant burden of grounding on the search stack, constraining the language model to a fi…

arXiv cs.CL TIER_1 English(EN) · Chunyu Li, Jingyi Kang, Ding Chen, Mengyuan Zhang, Jiajun Shen, Bo Tang, Xuanhe Zhou, Feiyu Xiong, Zhiyu Li · 2026-05-08 04:00

MemReranker: Reasoning-Aware Reranking for Agent Memory Retrieval

arXiv:2605.06132v1 Announce Type: new Abstract: In agent memory systems, the reranking model serves as the critical bridge connecting user queries with long-term memory. Most systems adopt the "retrieve-then-rerank" two-stage paradigm, but generic reranking models rely on semanti…

arXiv cs.LG TIER_1 English(EN) · Zeyu Yang, Qi Ma, Jason Chen, Anshumali Shrivastava · 2026-05-08 04:00

Superintelligent Retrieval Agent: The Next Frontier of Information Retrieval

arXiv:2605.06647v1 Announce Type: cross Abstract: Retrieval-augmented agents are increasingly the interface to large organizational knowledge bases, yet most still treat retrieval as a black box: they issue exploratory queries, inspect returned snippets, and iteratively reformula…

arXiv cs.LG TIER_1 English(EN) · Yijia Zheng, Marcel Worring · 2026-05-08 04:00

LatentRAG: Latent Reasoning and Retrieval for Efficient Agentic RAG

arXiv:2605.06285v1 Announce Type: cross Abstract: Single-step retrieval-augmented generation (RAG) provides an efficient way to incorporate external information for simple question answering tasks but struggles with complex questions. Agentic RAG extends this paradigm by replacin…

arXiv cs.AI TIER_1 English(EN) · Anshumali Shrivastava · 2026-05-07 17:54

Superintelligent Retrieval Agent: The Next Frontier of Information Retrieval

Retrieval-augmented agents are increasingly the interface to large organizational knowledge bases, yet most still treat retrieval as a black box: they issue exploratory queries, inspect returned snippets, and iteratively reformulate until useful evidence emerges. This approach re…

arXiv cs.CL TIER_1 English(EN) · Marcel Worring · 2026-05-07 13:56

LatentRAG: Latent Reasoning and Retrieval for Efficient Agentic RAG

Single-step retrieval-augmented generation (RAG) provides an efficient way to incorporate external information for simple question answering tasks but struggles with complex questions. Agentic RAG extends this paradigm by replacing single-step retrieval with a multi-step process,…

arXiv cs.CL TIER_1 English(EN) · Zhiyu Li · 2026-05-07 12:33

MemReranker: Reasoning-Aware Reranking for Agent Memory Retrieval

In agent memory systems, the reranking model serves as the critical bridge connecting user queries with long-term memory. Most systems adopt the "retrieve-then-rerank" two-stage paradigm, but generic reranking models rely on semantic similarity matching and lack genuine reasoning…

arXiv cs.CL TIER_1 English(EN) · Joshua Adler, Guy Zehavi · 2026-05-07 04:00

Storage Is Not Memory: A Retrieval-Centered Architecture for Agent Recall

arXiv:2605.04897v1 Announce Type: new Abstract: Extraction at ingestion is the wrong primitive for agent memory: content discarded before the query is known cannot be recovered at retrieval time. We propose True Memory, a six-layer architecture that shifts the center of the syste…

arXiv cs.AI TIER_1 English(EN) · Siheng Chen · 2026-05-06 17:54

LongSeeker: Elastic Context Orchestration for Long-Horizon Search Agents

Long-horizon search agents must manage a rapidly growing working context as they reason, call tools, and observe information. Naively accumulating all intermediate content can overwhelm the agent, increasing costs and the risk of errors. We propose that effective context manageme…

arXiv cs.CL TIER_1 English(EN) · Guy Zehavi · 2026-05-06 13:27

Storage Is Not Memory: A Retrieval-Centered Architecture for Agent Recall

Extraction at ingestion is the wrong primitive for agent memory: content discarded before the query is known cannot be recovered at retrieval time. We propose True Memory, a six-layer architecture that shifts the center of the system from a storage schema to a multi-stage retriev…

arXiv cs.AI TIER_1 English(EN) · Altan Cakir, Ayca Yerlikaya · 2026-05-06 04:00

From Experimental Limits to Physical Insight: A Retrieval-Augmented Multi-Agent Framework for Interpreting Searches Beyond the Standard Model

arXiv:2605.02491v1 Announce Type: cross Abstract: Modern searches for physics beyond the Standard Model produce rapidly expanding literature containing heterogeneous information, including textual analyses, numerical datasets, and graphical exclusion limits. Integrating these dis…

arXiv cs.CL TIER_1 English(EN) · Yilun Zhao, Jinbiao Wei, Tingyu Song, Siyue Zhang, Chen Zhao, Arman Cohan · 2026-05-06 04:00

Rethinking Reasoning-Intensive Retrieval: Evaluating and Advancing Retrievers in Agentic Search Systems

arXiv:2605.04018v1 Announce Type: new Abstract: Reasoning-intensive retrieval aims to surface evidence that supports downstream reasoning rather than merely matching topical similarity. This capability is increasingly important for agentic search systems, where retrievers must pr…

arXiv cs.CL TIER_1 English(EN) · Arman Cohan · 2026-05-05 17:42

Rethinking Reasoning-Intensive Retrieval: Evaluating and Advancing Retrievers in Agentic Search Systems

Reasoning-intensive retrieval aims to surface evidence that supports downstream reasoning rather than merely matching topical similarity. This capability is increasingly important for agentic search systems, where retrievers must provide complementary evidence across iterative se…

arXiv cs.AI TIER_1 English(EN) · Ayca Yerlikaya · 2026-05-04 11:42

From Experimental Limits to Physical Insight: A Retrieval-Augmented Multi-Agent Framework for Interpreting Searches Beyond the Standard Model

Modern searches for physics beyond the Standard Model produce rapidly expanding literature containing heterogeneous information, including textual analyses, numerical datasets, and graphical exclusion limits. Integrating these distributed sources remains a time-consuming and manu…

arXiv cs.CV TIER_1 English(EN) · Xiaozhu Ju · 2026-05-18 17:52

Robo-Cortex: A Self-Evolving Embodied Agent via Dual-Grain Cognitive Memory and Autonomous Knowledge Induction

The ability to navigate and interact with complex environments is central to real-world embodied agents, yet navigation in unseen environments remains challenging due to "experiential amnesia," where existing trajectory-driven or reactive policies fail to synthesize generalizable…

arXiv cs.CV TIER_1 English(EN) · Jiebo Luo · 2026-05-18 16:57

MementoGUI: Learning Agentic Multimodal Memory Control for Long-Horizon GUI Agents

Recent GUI agents have made substantial progress in visual grounding and action prediction, yet they remain brittle in long-horizon tasks that require maintaining task state across many interface transitions. Existing agents typically rely on raw history replay or text-only memor…

arXiv cs.CV TIER_1 English(EN) · Ruixiang Tang · 2026-05-14 17:37

MemEye: A Visual-Centric Evaluation Framework for Multimodal Agent Memory

Long-term agent memory is increasingly multimodal, yet existing evaluations rarely test whether agents preserve the visual evidence needed for later reasoning. In prior work, many visually grounded questions can be answered using only captions or textual traces, allowing answers …

Together AI blog TIER_1 English(EN) · 2026-05-19 00:00

Benchmarking inference at scale: coding agents

Real-world inference benchmarks for coding agents: 31% more TPS than TensorRT-LLM, 2× better TTFT at saturation, and 76% lower cost than Claude Opus 4.6.

Together AI blog TIER_1 English(EN) · 2026-02-25 00:00

CoderForge-Preview: SOTA open dataset for training efficient coding agents

Together AI blog TIER_1 English(EN) · 2025-07-02 00:00

DeepSWE: Training a Fully Open-sourced, State-of-the-Art Coding Agent by Scaling RL

Forbes — Innovation TIER_1 English(EN) · Liran Zvibel, Forbes Councils Member · 2026-05-13 10:00

AI’s Memory Crisis Is Here: Don’t Hoard, Optimize

The AI industry has been papering over architectural inefficiency with raw capacity.

MarkTechPost TIER_1 English(EN) · Michal Sutter · 2026-05-23 19:31

Tencent Open-Sources TencentDB Agent Memory: A 4-Tier Local Memory Pipeline for AI Agents

Tencent has open-sourced TencentDB Agent Memory, a fully local memory system for AI agents released under the MIT license. The project pairs symbolic short-term memory, which offloads verbose tool logs into a compact Mermaid task canvas, with a 4-tier long-term memory pyramid …

dev.to — Claude Code tag TIER_1 English(EN) · Toni Antunovic · 2026-05-23 17:04

Transitive Prompt Injection in Multi-Agent Coding Pipelines: One Poisoned Tool, Every Downstream Agent

This article was originally published on LucidShark Blog (https://lucidshark.com/blog/multi-agent-transitive-prompt-injection-coding-pipelines-2026). The upgrade from single-agent to multi-agent coding workflows felt like a…

MarkTechPost TIER_1 English(EN) · Asif Razzaq · 2026-05-22 18:23

A Step-by-Step Coding Tutorial to Implement GBrain: The Self-Wiring Memory Layer Built by Y Combinator’s Garry Tan for AI Agents

AI agents start every session from zero: no memory of meetings, notes, or decisions. GBrain, the open-source memory layer Y Combinator's Garry Tan built to power his own OpenClaw and Hermes deployments, fixes that with a markdown-first knowledge graph that wires itself throug…

dev.to — Claude Code tag TIER_1 English(EN) · Michael Tuszynski · 2026-05-21 15:14

The Coding Agent Stack Has Two Layers

The current "Hermes Agent vs Claude Code" (https://www.youtube.com/results?search_query=hermes+agent+vs+claude+code) framing is the wrong comparison. The two tools live at different layers of the coding agent stack, and most of the YouTube…

dev.to — Claude Code tag TIER_1 English(EN) · The Hive Collective · 2026-05-19 15:22

Give every Claude Code agent a shared, growing memory with one hook

Run Claude Code on real work for a while and you notice the same thing. Your agent figures out a non-obvious thing (a Postgres VACUUM quirk, a Tailwind v4 + shadcn collision, a Next.js caching gotcha) and that knowledge dies with the conversation. The next agent…

dev.to — Claude Code tag TIER_1 English(EN) · Theo Valmis · 2026-05-18 15:16

Long-running agents need more than memory

Anthropic's managed-agent harness solves one hard problem: continuity. Progress logs, feature lists, git checkpoints, and startup scripts give each new session a map of what happened. But continuity is not governance. As agents work across more sessions, the quest…

dev.to — Claude Code tag TIER_1 English(EN) · Andrew · 2026-05-15 11:13

agentmemory Review: Persistent Memory for AI Coding Agents

Originally published on andrew.ooo (https://andrew.ooo/posts/agentmemory-persistent-memory-ai-coding-agents-review/); visit the original for any updates, code snippets that aged out, or follow-up posts.…

dev.to — Claude Code tag TIER_1 Français(FR) · Michel Faure · 2026-05-11 10:00

Six days, six seconds: a CI test against the semantic drift of an AI agent

dev.to — Claude Code tag TIER_1 English(EN) · Michel Faure · 2026-05-11 10:00

Six days, six seconds: a CI test against semantic-layer drift on an AI agent

Mastodon — sigmoid.social TIER_1 한국어(KO) · [email protected] · 2026-05-25 17:41

Mnemosyne – Local-first memory for Hermes AI agents, sub-millisecond recalls

Mnemosyne – Memory for AI Hermes Agents, Sub-Millisecond Recalls, Local First. Mnemosyne is a local-first memory system for Hermes AI agents, offering SQLite-based sub-millisecond response times and 100% privacy. It runs fully offline with no cloud or external APIs, and supports vector search and hybrid ranking for fast, accurate memory recall. Through its BEAM architecture, working memory, episodic memory, scratch…

Link: mnemosyne.site

Medium — AI coding tag TIER_1 English(EN) · Dr. Fadi Shaar · 2026-05-25 08:49

Semble: The Semantic Code Search Library That Gives AI Coding Agents 94% Recall at 2,000 Tokens…

Towards AI TIER_1 English(EN) · Armin Norouzi, Ph.D · 2026-05-23 16:31

Agent Memory with Vector Stores: HNSW, Forgetting, and Budgets

Medium — AI coding tag TIER_1 English(EN) · Muhammad Rizwan · 2026-05-23 12:38

AI Coding Agents Should Not Hide Memory - Why NanoAgent Stores It in Repo Files

Mastodon — sigmoid.social TIER_1 English(EN) · [email protected] · 2026-05-22 18:52

GBrain is a new open-source memory layer for AI agents built by Y Combinator's Garry Tan. It uses a markdown-first knowledge graph that auto-wires itself throug

GBrain is a new open-source memory layer for AI agents built by Y Combinator's Garry Tan. It uses a markdown-first knowledge graph that auto-wires itself through regex inference, requiring zero LLM calls. His production brain already holds 146,646 pages, 24,585 people and 5,339 c…

Link: marktechpost.com/…/a-step-by-step-coding-…

Mastodon — sigmoid.social TIER_1 English(EN) · [email protected] · 2026-05-21 04:30

CLI vs MCP: Which Tool Interface Actually Works for AI Coding Agents? A technical comparison of CLI tools and Model Context Protocol for AI coding agents. Cover

CLI vs MCP: Which Tool Interface Actually Works for AI Coding Agents? A technical comparison of CLI tools and Model Context Protocol for AI coding agents. Covers token cost, reliability, composability, and setup friction so you can pick the right interface. https://pickuma.com/p…

Link: pickuma.com/…/cli-vs-mcp-tool-interface-f…

Mastodon — sigmoid.social TIER_1 English(EN) · [email protected] · 2026-05-21 04:29

Automate Python Code Reviews with Free Local LLMs and GitHub Actions Wire an open-weight model running in Ollama into a GitHub Actions workflow to get automated

Automate Python Code Reviews with Free Local LLMs and GitHub Actions. Wire an open-weight model running in Ollama into a GitHub Actions workflow to get automated first-pass code-review comments on Python pull requests, no API bill required. https://pickuma.com/posts/automate-pyt…

Link: pickuma.com/…/automate-python-code-review…

Mastodon — sigmoid.social TIER_1 English(EN) · [email protected] · 2026-05-21 04:28

Why AI Agents Forget: Memory Decay and Context Contamination Explained How context-window limits, the lost-in-the-middle effect, and stale data cause long-runni

Why AI Agents Forget: Memory Decay and Context Contamination Explained. How context-window limits, the lost-in-the-middle effect, and stale data cause long-running AI coding agents to lose track, and what you can do about it. https://pickuma.com/posts/why-ai-agents-forget-memor…

Link: pickuma.com/…/why-ai-agents-forget-memory…

Medium — Claude tag TIER_1 English(EN) · Rahil Pirani · 2026-05-20 04:51

I built persistent AI memory for Claude on Cloudflare’s free tier

dev.to — MCP tag TIER_1 English(EN) · Enrique B. · 2026-05-19 13:20

Your AI Agent is Stuck in a Loop. Here's the Memory Layer That Breaks It and Saves You Money

Every time you open a new chat in Cursor, VS Code, Antigravity and even Claude Desktop, you paste your codebase back in. Or you let the IDE do it automatically, same result. You're burning context tokens on files the agent already "knew" ten minutes ago in a different window. …

dev.to — MCP tag TIER_1 English(EN) · Ryan Ras · 2026-05-18 16:24

The Hidden Problem with Multi-Agent AI Systems: Shared Memory

The problem nobody talks about. When you run multiple AI agents, each one starts completely fresh. Zero knowledge of what other agents learned, decided, or remembered. Agent A spends an hour learning your codebase structure. Agent B starts tomorr…

dev.to — MCP tag TIER_1 English(EN) · Ruslan Manov · 2026-05-18 12:12

Reviewable Memory Consolidation for Local AI Agents

AI memory is usually sold as recall. That is only the first problem. A serious agent does not merely need to remember more. It needs a way to keep its memory from decaying into duplicates, stale facts…

dev.to — MCP tag TIER_1 English(EN) · KUSHAL BARAL · 2026-05-17 13:37

devmcp-context: A Simple AI Memory Layer for Your Agent

AI assistants are useful, but they often forget important details between sessions. That makes it hard to keep track of decisions, project notes, bugs, and tasks. devmcp-context solves that by giving your agent a simple memory layer that lives in your proje…

Towards AI TIER_1 English(EN) · Ampatishan Sivalingam · 2026-05-16 09:03

Under the Hood of Meko: How Distributed Infrastructure Solves the Multiagent Memory Crisis

Medium — Claude tag TIER_1 English(EN) · Amin Tazifor · 2026-05-16 00:42

Engineering Memory for AI Coding Agents: A Discipline and a 200-line Implementation

Medium — Claude tag TIER_1 English(EN) · Rick Hightower · 2026-05-15 20:54

The Memory Leak in Your AI Strategy: Architecting for LLM Reliability at Scale

Medium — AI coding tag TIER_1 English(EN) · Dilawar Abbas · 2026-05-14 10:09

The four-memory model that makes AI coding agents finally remember

Every AI coding agent (Claude Code, Cursor, GitHub Copilot, OpenCode) reads its own config file. I was maintaining the same project…

Towards AI TIER_1 English(EN) · Subrat Pati · 2026-05-14 06:12

Building the AI Memory Stack: Layered Storage, Async Extraction and Atomic Persistence

Every AI agent you build today can hold a conversation. It can reason, use tools, and chain together complex workflows. But the moment a session ends, everything disappears. The agent forgets who you are, what you were working on, and every preference it learned during the con…

dev.to — MCP tag TIER_1 English(EN) · Rumblingb · 2026-05-13 06:29

Why Every AI Agent Needs Persistent Memory: Introducing Agent Memory MCP

The Memory Problem in AI Agents. Modern LLMs are incredibly powerful, but they have a fundamental limitation: they forget everything between conversations. Every time you start a new session with an AI agent, it's like talking to someone with amnesia…

dev.to — MCP tag TIER_1 English(EN) · Gowtham · 2026-05-11 18:34

Building a Local Markdown Memory Layer for AI Agents

I kept running into the same problem with AI coding agents. The agents were getting better, but every new session still felt like starting from zero. I would explain the repo again. Then my preferences again. Then the decisions we already made. Then w…

dev.to — MCP tag TIER_1 English(EN) · Gowtham S · 2026-05-11 18:34

Building a Local Markdown Memory Layer for AI Agents

I kept running into the same problem with AI coding agents. The agents were getting better, but every new session still felt like starting from zero. I would explain the repo again. Then my preferences again. Then the decisions we already made. Then w…

dev.to — LLM tag TIER_1 English(EN) · Shilpa Mitra · 2026-05-24 17:08

How Claude Code Achieves a 92% Cache Hit Rate: A Deep Dive Into Prompt Caching for AI Agents

If you're running AI agents in production, there's a cost you're probably not thinking about. Every turn in an agentic conversation sends the full prompt to the model. That includes the system instructions, all the tool definitions, any project context that was loaded e…

Mastodon — fosstodon.org TIER_1 English(EN) · [email protected] · 2026-05-23 19:31

Tencent Open-Sources TencentDB Agent Memory: A 4-Tier Local Memory Pipeline for AI Agents Tencent has open-sourced TencentDB Agent Memory, a fully local memory

Tencent Open-Sources TencentDB Agent Memory: A 4-Tier Local Memory Pipeline for AI Agents Tencent has open-sourced TencentDB Agent Memory, a fully local memory system for AI agents released under t... #Agentic #AI #AI #Infrastructure #Applications #Artificial #Intelligence #Edito…

Link: marktechpost.com/…/tencent-open-sources-t… awakari.com/sub-details.html awakari.com/pub-msg.html

dev.to — LLM tag TIER_1 English(EN) · Mahmoud Zalt · 2026-05-23 04:17

The 7-Layer Memory Architecture Behind Modern AI Agents

How do you make an AI agent actually remember?

dev.to — LLM tag TIER_1 English(EN) · Abuzar Gore · 2026-05-22 09:46

LLM-Wiki: Multi-Agent Memory Without RAG

How three AI agents can collaborate on a complex task by sharing a folder of markdown files, and nothing else.

dev.to — LLM tag TIER_1 English(EN) · Vaishnavi Gudur · 2026-05-21 15:12

Your No-Code AI Agent Has a Memory Problem

If you're building AI agents with Flowise, Dify, n8n, or similar no-code/low-code platforms, there's a security threat you probably haven't thought about: memory poisoning. And it's not theoretical. It's in the https://owasp.org/www-project-top…

dev.to — LLM tag TIER_1 Nederlands(NL) · Agdex AI · 2026-05-21 06:50

Best AI Agent Memory Tools in 2026: Mem0 vs Zep vs Letta vs MemGPT

Ask a stateless AI agent about something you told it last week, and it remembers nothing. That's the core problem memory tools solve. In 2026, long-term memory for AI agents has become one of the hottest areas in the ecosystem, with dedicated tools like …

dev.to — LLM tag TIER_1 English(EN) · Vaishnavi Gudur · 2026-05-20 17:33

Securing LangGraph Multi-Agent Workflows Against Memory Poisoning (ASI06)

LangGraph has become the de facto standard for building complex, multi-agent workflows. Its core abstraction, the state graph, allows developers to build cyclic, stateful applications where agen…

Mastodon — fosstodon.org TIER_1 English(EN) · [email protected] · 2026-05-20 05:55

MemSkill reframes LLM-agent memory operations as a learnable skill bank: an RL controller selects Top-K skills per span, an LLM designer periodically rewrites t

MemSkill reframes LLM-agent memory operations as a learnable skill bank: an RL controller selects Top-K skills per span, an LLM designer periodically rewrites them from hard cases. But "self-evolving" overstates the test-time story — both controller and bank are trained offline a…

Link: benjaminhan.net/…/20260519-memskill

dev.to — LLM tag TIER_1 English(EN) · Vaishnavi Gudur · 2026-05-19 16:50

Your AI Agent's Memory is a Security Hole — Here's the Fix

I've been working on AI agent security for the past few months as part of the OWASP Top 10 for … (https://owasp.org/www-project-top-10-for-large-language-model-applications/)

dev.to — LLM tag TIER_1 English(EN) · R Hiroshini · 2026-05-19 06:55

"The Bug That Forced Us to Add Agent Memory"

Project: Nexus Core AI OS. Stack: Hindsight (persistent memory) · cascadeflow (runtime intelligence & routing). 1. Introduction. I didn't plan to build a memory sys…

Mastodon — fosstodon.org TIER_1 Italiano(IT) · [email protected] · 2026-05-18 07:36

Android and AI: Is 128GB of Memory Becoming Insufficient? As AI Features Advance on Android, Storage Space

Android and AI: is 128 GB of storage becoming insufficient? As AI features advance on Android, smartphone storage space risks becoming an increasingly critical bottleneck. At the center of the problem is AICore, …

Mastodon — fosstodon.org TIER_1 English(EN) · [email protected] · 2026-05-18 07:23

🤖 Which project/framework has actually nailed persistent memory for AI agents? Not talking about the LLM itself but about the memory layer on top. There are qui

🤖 Which project/framework has actually nailed persistent memory for AI agents? Not talking about the LLM itself but about the memory layer on top. There are quite a few out there now, open source ones and proprietary frameworks. Curious what people have actually tried and stu... …

Mastodon — fosstodon.org TIER_1 English(EN) · [email protected] · 2026-05-17 14:39

Hermes Memory Installer Review: One-Command Persistent Memory for Local AI Agents Nous Research's Hermes Memory Installer adds local persistent memory to AI age

Hermes Memory Installer Review: One-Command Persistent Memory for Local AI Agents. Nous Research's Hermes Memory Installer adds local persistent memory to AI agents with one shell command. We compare its file-based approach to Mem0 and Letta. https://pickuma.com/posts/hermes-memo…

Link: pickuma.com/…/hermes-memory-installer-rev…

dev.to — LLM tag TIER_1 English(EN) · Ken W Alger · 2026-05-12 15:35

Engineering Agent Memory

From Stateless Prompts to Persistent Intelligence. Where this fits: This article bridges two series. It closes out the themes introduced in The Backyard Quarry, a data engineering exploration using physical objects as a teaching domain, and…

Mastodon — fosstodon.org TIER_1 English(EN) · [email protected] · 2026-05-12 01:05

🧠 Graft provides a semantic memory system for AI agents that operates independently of large language models. The tool allows agents to store and retrieve infor

🧠 Graft provides a semantic memory system for AI agents that operates independently of large language models. The tool allows agents to store and retrieve information based on meaning rather than exact text matching. 💬 Hacker News 🔗 https://github.com/AEndrix03/Graft #AI #Mach…

Link: github.com/…/Graft

dev.to — LLM tag TIER_1 English(EN) · vishalmysore · 2026-05-11 21:12

ReasoningBank: Building AI Agents that Actually Learn from Experience

In the world of Large Language Models (LLMs), we often face a frustrating paradox: LLMs are incredibly capable at "reasoning" in the moment, but they are fundamentally stateless. Every time you start a new session, the agent has total amnesia. It doesn't remem…

dev.to — LLM tag TIER_1 English(EN) · Poniak Labs · 2026-05-11 19:15

SubQ Model: Can Subquadratic Make Long-Context AI More Efficient?

Originally published on Poniak Times (https://www.poniaktimes.com/subq-model-efficient-long-context-ai/). Reposted here for the developer and AI engineering community. Subquadratic’s SubQ model claims to make long-context A…

dev.to — LLM tag TIER_1 English(EN) · Jonathanfarrow · 2026-05-11 08:23

The 10 Best AI Memory Layers for Agents in 2026

If you are building agents in 2026, you have already hit the wall. Bigger models do not fix forgetfulness. Context windows can grow forever, and the agent still cannot remember what a user told it last Tuesday, that the customer's address changed three months ago, or that a re…

dev.to — LLM tag TIER_1 English(EN) · 丁久 · 2026-05-11 04:17

AI Agents Memory Patterns: Working, Episodic, Semantic, and Reflective Memory

This article was originally published on AI Study Room (https://dingjiu1989-hue.github.io/en/ai/ai-agents-memory-patterns.html). For the full version with working code examples and related articles, visit the original post…

dev.to — LLM tag TIER_1 Español(ES) · Tirso García · 2026-05-10 14:59

Building Kernel Memory Protocol: Navigable Memory for AI Agents

English version: Building Kernel Memory Protocol: Navigable Memory for AI Agents (https://dev.to/tirsogarcia/building-kernel-memory-protocol-navigable-memory-for-ai-agents-315j). The problem with many AI agents is not that they lack…

dev.to — LLM tag TIER_1 English(EN) · Tirso García · 2026-05-10 14:20

Building Kernel Memory Protocol: Navigable Memory for AI Agents

Spanish version: Construyendo Kernel Memory Protocol: memoria navegable para agentes de IA (https://dev.to/tirsogarcia/construyendo-kernel-memory-protocol-memoria-navegable-para-agentes-de-ia-24lc). The hard part with many AI age…

dev.to — LLM tag TIER_1 English(EN) · tokozen · 2026-05-08 05:40

How Agentic Search Actually Works: The Research Loop Link-Fetching Agents Miss

Most agent tutorials show you the same pattern: take a user query, call a search API, grab the top result, stuff the text into your prompt. Done. Ship it. That works fine for trivi…

Mastodon — mastodon.social TIER_1 English(EN) · [email protected] · 2026-05-16 09:30

Δ-Mem: Efficient Online Memory for Large Language Models https://arxiv.org/abs/2605.12357 #HackerNews #Tech #AI

Link: arxiv.org/…/2605.12357

r/cursor TIER_2 English(EN) · /u/EvanBuilds2026 · 2026-05-21 13:46

Why most AI coding agent memory systems fail — and what actually works

I’ve been building my own persistent memory layer for coding agents, and along the way I realized something surprising: Most memory systems out there are basically just session-based retrieval. They don’t forget, they don’t manage life…

Coverage sources [126]
