New benchmarks and methods tackle AI agent memory limitations
By the PulseAugur editorial team · [126 sources] ·
Researchers are developing new benchmarks and methods to evaluate and improve the memory capabilities of AI agents. These efforts address limitations in current systems, which struggle with long-term recall, interference between memories, and reasoning over complex, evolving information. New benchmarks like LongMINT, EvoMemBench, and SocialMemBench are being introduced to test agents in more realistic scenarios, including social settings and multimodal data. Additionally, novel memory architectures such as FORGE, RecMem, DimMem, H-Mem, and MeMo are being proposed to enhance efficiency, reduce token costs, and prevent catastrophic forgetting.
AI
Impact
Advances in agent memory systems are crucial for developing more capable and reliable AI assistants across diverse applications.
Ranking rationale
Multiple research papers introduce new benchmarks and memory architectures for AI agents.
Following the launch of Qwen3.6-Plus, we are excited to open-source Qwen3.6-35B-A3B — a sparse yet remarkably capable mixture-of-experts (MoE) model with 35 billion total parameters and only 3 billion active parameters. Despite its efficiency, Qwen3.6-35B-A3B delivers outstanding…
arXiv:2605.12260v2 Announce Type: replace Abstract: Long-horizon language agents accumulate conversation history far faster than any fixed context window can hold, making memory management critical to both answer accuracy and serving cost. Existing approaches either expand the co…
arXiv cs.CL
TIER_1English(EN)·Alina Shutova, Alexandra Olenina, Ivan Vinogradov, Anton Sinitsin·
arXiv:2602.11243v2 Announce Type: replace-cross Abstract: Modern LLM-based agents and chat assistants rely on long-term memory frameworks to store reusable knowledge, recall user preferences, and augment reasoning. As researchers create more complex memory architectures, it becom…
arXiv cs.CL
TIER_1English(EN)·Jingru Lin, Chen Zhang, Stephen Y. Liu, Haizhou Li·
arXiv:2605.21951v1 Announce Type: new Abstract: Achieving self-evolution in intelligent agents requires the continual accumulation of new knowledge across changing task sequences without forgetting previously acquired abilities. Existing approaches either internalize knowledge by…
arXiv cs.LG
TIER_1English(EN)·Sikuan Yan, Ahmed Bahloul, Ercong Nie, Susanna Schwarzmann, Riccardo Trivisonno, Volker Tresp, Yunpu Ma·
arXiv:2605.21768v1 Announce Type: new Abstract: Memory-augmented LLM agents enable interactions that extend beyond finite context windows by storing, updating, and reusing information across sessions. However, training such agents with reinforcement learning in multi-session envi…
arXiv:2604.15774v2 Announce Type: replace Abstract: Equipping Large Language Models (LLMs) with persistent memory enhances interaction continuity and personalization but introduces new safety risks. Specifically, contaminated or biased memory accumulation can trigger abnormal age…
arXiv:2605.20251v2 Announce Type: cross Abstract: Existing benchmarks for LLM coding agents primarily evaluate final outcomes. While useful for measuring overall capability, these metrics provide limited visibility and often miss defects that arise during execution. We present Pr…
arXiv:2602.06025v2 Announce Type: replace-cross Abstract: Memory is increasingly central to Large Language Model (LLM) agents operating beyond a single context window, yet most existing systems rely on offline, query-agnostic memory construction that can be inefficient and may di…
arXiv:2602.19320v2 Announce Type: replace-cross Abstract: Agentic memory systems enable large language model (LLM) agents to maintain state across long interactions, supporting long-horizon reasoning and personalization beyond fixed context windows. Despite rapid architectural de…
Self-evolving multi-agent systems (MAS) have emerged as a promising route to LLM agents that continually improve from experience, with persistent memory at their foundation. However, existing designs almost exclusively adopt a centralized repository shared across agents, incurrin…
Memory-augmented LLM agents enable interactions that extend beyond finite context windows by storing, updating, and reusing information across sessions. However, training such agents with reinforcement learning in multi-session environments is challenging because memory turns the…
arXiv cs.CL
TIER_1English(EN)·Dimitris N. Metaxas·
Memory is a central capability for LLM agents operating across long-horizon tasks. Existing memory benchmarks predominantly evaluate retention of personalized information in multi-turn chat scenarios, overlooking the dynamic memory formation that occurs during extended agent exec…
Language agents increasingly operate over streams of related tasks, yet existing memory systems struggle to convert accumulated experience into reusable knowledge. Retrieval-augmented and structured memory methods record per-session observations effectively, but often couple acqu…
To enable reliable long-term interaction, LLM agents require a memory system that can faithfully store, efficiently retrieve, and deeply reason over accumulated dialogue history. Most existing methods adopt an extracted fact based paradigm: handcrafted static prompts compress raw…
Large language model (LLM) agents increasingly operate over long and recurring external contexts, like document corpora and code repositories. Across invocations, existing approaches preserve either the agent's trajectory, passive access to raw material, or task-level strategies.…
The Mixture-of-Agents (MoA) framework has shown promise in improving large language model (LLM) performance by aggregating outputs from multiple agents. However, existing MoA systems often rely on static routers that do not fully capture temporal and contextual dependencies acros…
Real-world agents operate over long and evolving horizons, where information is repeatedly updated and may interfere across memories, requiring accurate recall and aggregated reasoning over multiple pieces of information. However, existing benchmarks focus on static, independent …
Recent benchmarks for Large Language Model (LLM) agents mainly evaluate reasoning, planning, and execution. However, memory is also essential for agents, as it enables them to store, update, and retrieve information over time. This ability remains under-evaluated, largely because…
Safety evaluations of memory-equipped LLM agents typically measure within-task safety: whether an agent completes a single scenario safely, often under adversarial conditions such as prompt injection or memory poisoning. In deployment, however, a single agent serves many independ…
Memory systems for AI assistants were built for single-user dialogue and fail characteristically when applied to multi-party social group settings. This gap matters for the social assistants being built today: group-acting agents embedded in chat platforms, and proactive personal…
Can LLM agents improve decision-making through self-generated memory without gradient updates? We propose FORGE (Failure-Optimized Reflective Graduation and Evolution), a staged, population-based protocol that evolves prompt-injected natural-language memory for hierarchical ReAct…
Memory systems often organize user-agent interactions as retrievable external memory and are crucial for long-running agents by overcoming the limited context windows of LLMs. However, existing memory systems invoke LLMs to process every incoming interaction for memory extraction…
Large language model (LLM) agents require long-term memory to leverage information from past interactions. However, existing memory systems often face a fidelity–efficiency trade-off: raw dialogue histories are expensive, while flat facts or summaries may discard the structure n…
Existing benchmarks for multimodal memory reasoning largely evaluate systems within pre-assembled contexts, but under-evaluate whether agents can use evidence distributed across independently originated sources. We argue that source-distributed memory composition is an important …
Memory data are ubiquitous in Large Language Model (LLM)-based agents (e.g., OpenClaw and Manus). A few recent works have attempted to exploit agents' memory for improving their performance on the question-answering (QA) task, but they lack a principled mechanism for effectively m…
Large language models (LLMs) achieve strong performance across a wide range of tasks, but remain frozen after pretraining until subsequent updates. Many real-world applications require timely, domain-specific information, motivating the need for efficient mechanisms to incorporat…
arXiv cs.AI
TIER_1English(EN)·Jorge Alberto Hidalgo Toledo·
Large language models (LLMs) have been extensively studied from computational and cognitive perspectives, yet their behavior as communicative actors in socially structured contexts remains underexplored. This study examines whether LLM-based multi-agent systems exhibit systematic…
Large Language Model (LLM) agents increasingly serve as personal assistants and workplace collaborators, where their utility depends on memory systems that extract, retrieve, and apply information across long-running conversations. However, both existing memory systems and benchm…
Memory-augmented LLM agents have advanced personalized recommendation, yet existing approaches universally adopt flat memory representations that conflate ephemeral signals with stable preferences, and none provides a complete lifecycle governing how memory should evolve. We prop…
Long-term memory is crucial for agents in specialized web environments, where success depends on recalling interface affordances, state dynamics, workflows, and recurring failure modes. However, existing memory benchmarks for agents mostly focus on user histories, short traces, o…
Recent advances in reinforcement learning from human feedback (RLHF) and preference optimization have substantially improved the usability, coherence, and safety of large language models. However, recurring behaviors such as performative certainty, hallucinated continuity, calibr…
Modern GUI agents typically rely on a model-centric and step-wise interaction paradigm, where LLMs must re-interpret the UI and re-decide actions at every screen, which is fragile in long-horizon tasks. In this paper, we propose Executable Agentic Memory (EAM), a structured Knowl…
Long-horizon language agents accumulate conversation history far faster than any fixed context window can hold, making memory management critical to both answer accuracy and serving cost. Existing approaches either expand the context window without addressing what is retrieved, p…
LLM-based conversational AI agents struggle to maintain coherent behavior over long horizons due to limited context. While RAG-based approaches are increasingly adopted to overcome this limitation by storing interactions in external memory modules and performing retrieval from th…
Long-horizon language agents must operate under limited runtime memory, yet existing memory mechanisms often organize experience around descriptive criteria such as relevance, salience, or summary quality. For an agent, however, memory is valuable not because it faithfully descri…
Does a lexical retriever suffice as large language models (LLMs) become more capable in an agentic loop? This question naturally arises when building deep research systems. We revisit it by pairing BM25 with frontier LLMs that have better reasoning and tool-use abilities. To supp…
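For readers unfamiliar with the lexical baseline named in the item above, the following is a minimal sketch of Okapi BM25 scoring. It is not taken from the paper: the function name `bm25_scores`, the toy corpus, and the parameter defaults `k1=1.5` and `b=0.75` are assumptions chosen for illustration.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against the query with Okapi BM25.

    Illustrative sketch only; k1 and b are common defaults, not values
    reported by any source in this digest.
    """
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization penalizes documents longer than average.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Hypothetical toy corpus, whitespace-tokenized.
docs = [
    "agent memory benchmark for long horizon tasks".split(),
    "lexical retriever with bm25 ranking".split(),
    "cooking recipes for pasta".split(),
]
scores = bm25_scores("bm25 retriever".split(), docs)
best = max(range(len(docs)), key=scores.__getitem__)
```

In an agentic loop, a scorer like this would typically be called repeatedly with reformulated queries; the sketch covers only a single scoring pass.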
To tackle long-context reasoning tasks without the quadratic complexity of standard attention mechanisms, approaches based on agent memory have emerged, which typically maintain a dynamically updated memory when linearly processing document chunks. To mitigate the potential loss …
As 6G evolves, the radio access network must transcend traditional automation to embrace agentic AI capable of perception, reasoning, and evolution. A fundamental cognitive gap persists in current disaggregated architectures, where interfaces force the physical layer to compress …
Existing benchmarks for multimodal agentic search evaluate multimodal search and visual browsing, but visual evidence is either confined to the input or treated as an answer endpoint rather than part of an interleaved search trajectory. We introduce InterLV-Search, a ben…
arXiv cs.CL
TIER_1English(EN)·Junfeng Liao, Qizhou Wang, Jianing Zhu, Bo Du, Rui Yan, Xiuying Chen·
arXiv:2605.05583v1 Announce Type: cross Abstract: LLM agents that operate over long context depend on external memory to accumulate knowledge over time. However, existing methods typically store each observation as a single deterministic conclusion (e.g., inferring "API X failed"…
arXiv:2604.20050v2 Announce Type: replace-cross Abstract: Can Large Language Models (AI agents) aggregate dispersed private information through trading and reason about the knowledge of others by observing price movements? We conduct a controlled experiment where AI agents trade …
arXiv:2510.12635v3 Announce Type: replace Abstract: Long-context Large Language Models, despite their expanded capacity, require careful working memory management to mitigate attention dilution during long-horizon tasks. Yet existing approaches rely on external mechanisms that la…
arXiv cs.AI
TIER_1English(EN)·Zhuofeng Li, Haoxiang Zhang, Cong Wei, Pan Lu, Ping Nie, Yi Lu, Yuyang Bai, Shangbin Feng, Hangxiao Zhu, Ming Zhong, Yuyu Zhang, Jianwen Xie, Yejin Choi, James Zou, Jiawei Han, Wenhu Chen, Jimmy Lin, Dongfu Jiang, Yu Zhang·
arXiv:2605.05242v1 Announce Type: cross Abstract: Modern retrieval systems, whether lexical or semantic, expose a corpus through a fixed similarity interface that compresses access into a single top-k retrieval step before reasoning. This abstraction is efficient, but for agentic…
arXiv cs.AI
TIER_1English(EN)·Huyu Wu, Jun Liu, Xiaochi Wei, Yan Gao, Yi Wu, Yao Hu·
arXiv:2605.05702v1 Announce Type: new Abstract: Self-evolving search agents reduce reliance on human-written training questions by generating and solving their own search tasks. We build on Search Self-Play (SSP), a representative Proposer and Solver framework in which questions …
arXiv:2605.05538v1 Announce Type: new Abstract: We present AgenticRAG, a practical agentic harness for retrieval and analysis over enterprise knowledge bases. Standard RAG pipelines place significant burden of grounding on the search stack, constraining the language model to a fi…
arXiv:2605.06132v1 Announce Type: new Abstract: In agent memory systems, the reranking model serves as the critical bridge connecting user queries with long-term memory. Most systems adopt the "retrieve-then-rerank" two-stage paradigm, but generic reranking models rely on semanti…
arXiv cs.LG
TIER_1English(EN)·Zeyu Yang, Qi Ma, Jason Chen, Anshumali Shrivastava·
arXiv:2605.06647v1 Announce Type: cross Abstract: Retrieval-augmented agents are increasingly the interface to large organizational knowledge bases, yet most still treat retrieval as a black box: they issue exploratory queries, inspect returned snippets, and iteratively reformula…
arXiv:2605.06285v1 Announce Type: cross Abstract: Single-step retrieval-augmented generation (RAG) provides an efficient way to incorporate external information for simple question answering tasks but struggles with complex questions. Agentic RAG extends this paradigm by replacin…
Retrieval-augmented agents are increasingly the interface to large organizational knowledge bases, yet most still treat retrieval as a black box: they issue exploratory queries, inspect returned snippets, and iteratively reformulate until useful evidence emerges. This approach re…
Single-step retrieval-augmented generation (RAG) provides an efficient way to incorporate external information for simple question answering tasks but struggles with complex questions. Agentic RAG extends this paradigm by replacing single-step retrieval with a multi-step process,…
In agent memory systems, the reranking model serves as the critical bridge connecting user queries with long-term memory. Most systems adopt the "retrieve-then-rerank" two-stage paradigm, but generic reranking models rely on semantic similarity matching and lack genuine reasoning…
arXiv cs.CL
TIER_1English(EN)·Joshua Adler, Guy Zehavi·
arXiv:2605.04897v1 Announce Type: new Abstract: Extraction at ingestion is the wrong primitive for agent memory: content discarded before the query is known cannot be recovered at retrieval time. We propose True Memory, a six-layer architecture that shifts the center of the syste…
Long-horizon search agents must manage a rapidly growing working context as they reason, call tools, and observe information. Naively accumulating all intermediate content can overwhelm the agent, increasing costs and the risk of errors. We propose that effective context manageme…
Extraction at ingestion is the wrong primitive for agent memory: content discarded before the query is known cannot be recovered at retrieval time. We propose True Memory, a six-layer architecture that shifts the center of the system from a storage schema to a multi-stage retriev…
arXiv:2605.02491v1 Announce Type: cross Abstract: Modern searches for physics beyond the Standard Model produce rapidly expanding literature containing heterogeneous information, including textual analyses, numerical datasets, and graphical exclusion limits. Integrating these dis…
arXiv:2605.04018v1 Announce Type: new Abstract: Reasoning-intensive retrieval aims to surface evidence that supports downstream reasoning rather than merely matching topical similarity. This capability is increasingly important for agentic search systems, where retrievers must pr…
Reasoning-intensive retrieval aims to surface evidence that supports downstream reasoning rather than merely matching topical similarity. This capability is increasingly important for agentic search systems, where retrievers must provide complementary evidence across iterative se…
Modern searches for physics beyond the Standard Model produce rapidly expanding literature containing heterogeneous information, including textual analyses, numerical datasets, and graphical exclusion limits. Integrating these distributed sources remains a time-consuming and manu…
The ability to navigate and interact with complex environments is central to real-world embodied agents, yet navigation in unseen environments remains challenging due to "experiential amnesia," where existing trajectory-driven or reactive policies fail to synthesize generalizable…
Recent GUI agents have made substantial progress in visual grounding and action prediction, yet they remain brittle in long-horizon tasks that require maintaining task state across many interface transitions. Existing agents typically rely on raw history replay or text-only memor…
Long-term agent memory is increasingly multimodal, yet existing evaluations rarely test whether agents preserve the visual evidence needed for later reasoning. In prior work, many visually grounded questions can be answered using only captions or textual traces, allowing answers …
Real-world inference benchmarks for coding agents: 31% more TPS than TensorRT-LLM, 2× better TTFT at saturation, and 76% lower cost than Claude Opus 4.6.
Tencent has open-sourced TencentDB Agent Memory, a fully local memory system for AI agents released under the MIT license. The project pairs symbolic short-term memory, which offloads verbose tool logs into a compact Mermaid task canvas, with a 4-tier long-term memory pyramid …
dev.to — Claude Code tag
TIER_1English(EN)·Toni Antunovic·
This article was originally published on LucidShark Blog (https://lucidshark.com/blog/multi-agent-transitive-prompt-injection-coding-pipelines-2026). The upgrade from single-agent to multi-agent coding workflows felt like a…
AI agents start every session from zero: no memory of meetings, notes, or decisions. GBrain, the open-source memory layer Y Combinator's Garry Tan built to power his own OpenClaw and Hermes deployments, fixes that with a markdown-first knowledge graph that wires itself throug…
dev.to — Claude Code tag
TIER_1English(EN)·Michael Tuszynski·
The current "Hermes Agent vs Claude Code" (https://www.youtube.com/results?search_query=hermes+agent+vs+claude+code) framing is the wrong comparison. The two tools live at different layers of the coding agent stack, and most of the YouTube…
dev.to — Claude Code tag
TIER_1English(EN)·The Hive Collective·
Run Claude Code on real work for a while and you notice the same thing. Your agent figures out a non-obvious thing (a Postgres VACUUM quirk, a Tailwind v4 + shadcn collision, a Next.js caching gotcha) and that knowledge dies with the conversation. The next agent…
dev.to — Claude Code tag
TIER_1English(EN)·Theo Valmis·
Anthropic's managed-agent harness solves one hard problem: continuity. Progress logs, feature lists, git checkpoints, and startup scripts give each new session a map of what happened. But continuity is not governance. As agents work across more sessions, the quest…
dev.to — Claude Code tag
TIER_1English(EN)·Andrew·
Originally published on andrew.ooo (https://andrew.ooo/posts/agentmemory-persistent-memory-ai-coding-agents-review/); visit the original for any updates, code snippets that aged out, or follow-up posts.…
dev.to — Claude Code tag
TIER_1Français(FR)·Michel Faure·
Mnemosyne – Memory for AI Hermes Agents, Sub-Millisecond Recalls, Local First. Mnemosyne is a local-first memory system for Hermes AI agents that provides SQLite-based sub-millisecond response times and 100% privacy. It runs fully offline without the cloud or external APIs, and supports vector search and hybrid ranking for fast, accurate recall. Through its BEAM architecture, working memory, episodic memory, scratch…
GBrain is a new open-source memory layer for AI agents built by Y Combinator's Garry Tan. It uses a markdown-first knowledge graph that auto-wires itself through regex inference, requiring zero LLM calls. His production brain already holds 146,646 pages, 24,585 people and 5,339 c…
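To make the idea of regex-based link inference concrete, here is a minimal sketch of how a graph could be wired from markdown pages without any LLM calls. It does not reproduce GBrain's actual patterns or data model, which this digest does not describe: the `[[Page]]` wiki-link and `@handle` mention conventions, the names `WIKI_LINK`, `MENTION`, and `build_graph`, and the sample pages are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical conventions, assumed for illustration only:
# [[Page Name]] or [[Page Name|label]] wiki-links, and @handle mentions.
WIKI_LINK = re.compile(r"\[\[([^\[\]|]+)(?:\|[^\[\]]*)?\]\]")
MENTION = re.compile(r"(?<!\w)@([A-Za-z][\w-]*)")

def build_graph(pages):
    """Map each page title to the set of targets its markdown text refers to."""
    graph = defaultdict(set)
    for title, text in pages.items():
        for match in WIKI_LINK.finditer(text):
            graph[title].add(match.group(1).strip())
        for match in MENTION.finditer(text):
            graph[title].add(match.group(1))
    # Pages with no outgoing references are simply absent from the result.
    return dict(graph)

# Hypothetical sample pages.
pages = {
    "Weekly sync": "Met with @alice about [[Memory Layer]]; see [[Roadmap|the plan]].",
    "Memory Layer": "Owned by @bob. Depends on [[Roadmap]].",
    "Roadmap": "No links here.",
}
graph = build_graph(pages)
```

Because the edges come from pattern matching alone, rebuilding the graph costs no model tokens. The trade-off is that references phrased outside the assumed conventions are missed.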
CLI vs MCP: Which Tool Interface Actually Works for AI Coding Agents? A technical comparison of CLI tools and Model Context Protocol for AI coding agents. Covers token cost, reliability, composability, and setup friction so you can pick the right interface. https://pickuma.com/p…
Automate Python Code Reviews with Free Local LLMs and GitHub Actions. Wire an open-weight model running in Ollama into a GitHub Actions workflow to get automated first-pass code-review comments on Python pull requests, with no API bill required. https://pickuma.com/posts/automate-pyt…
Why AI Agents Forget: Memory Decay and Context Contamination Explained. How context-window limits, the lost-in-the-middle effect, and stale data cause long-running AI coding agents to lose track, and what you can do about it. https://pickuma.com/posts/why-ai-agents-forget-memor…
Every time you open a new chat in Cursor, VS Code, Antigravity and even Claude Desktop, you paste your codebase back in. Or you let the IDE do it automatically, same result. You're burning context tokens on files the agent already "knew" ten minutes ago in a different window. …
The problem nobody talks about. When you run multiple AI agents, each one starts completely fresh. Zero knowledge of what other agents learned, decided, or remembered. Agent A spends an hour learning your codebase structure. Agent B starts tomorr…
Reviewable Memory Consolidation for Local AI Agents. AI memory is usually sold as recall. That is only the first problem. A serious agent does not merely need to remember more. It needs a way to keep its memory from decaying into duplicates, stale facts…
AI assistants are useful, but they often forget important details between sessions. That makes it hard to keep track of decisions, project notes, bugs, and tasks. devmcp-context solves that by giving your agent a simple memory layer that lives in your proje…
Towards AI
TIER_1English(EN)·Ampatishan Sivalingam·
Every AI coding agent (Claude Code, Cursor, GitHub Copilot, OpenCode) reads its own config file. I was maintaining the same project… (https://medium.com/@dil…
Every AI agent you build today can hold a conversation. It can reason, use tools, and chain together complex workflows. But the moment a session ends, everything disappears. The agent forgets who you are, what you were working on, and every preference it learned during the con…
The Memory Problem in AI Agents. Modern LLMs are incredibly powerful, but they have a fundamental limitation: they forget everything between conversations. Every time you start a new session with an AI agent, it's like talking to someone with amnesia…
I kept running into the same problem with AI coding agents. The agents were getting better, but every new session still felt like starting from zero. I would explain the repo again. Then my preferences again. Then the decisions we already made. Then w…
If you're running AI agents in production, there's a cost you're probably not thinking about. Every turn in an agentic conversation sends the full prompt to the model. That includes the system instructions, all the tool definitions, any project context that was loaded e…
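The cost described in the item above can be illustrated with a small arithmetic sketch. It is a simplified model, not a billing formula from any provider: it assumes no prompt caching, a fixed prefix (system instructions, tool definitions, project context) resent on every turn, and a constant number of new tokens per turn. The function name and the numbers are hypothetical.

```python
def cumulative_input_tokens(fixed_prefix, tokens_per_turn, turns):
    """Total input tokens sent when every turn resends prefix plus full history.

    Simplified model: no caching, constant growth per turn (assumptions).
    """
    total = 0
    history = 0
    for _ in range(turns):
        # Each request carries the fixed prefix and everything said so far.
        total += fixed_prefix + history
        history += tokens_per_turn
    return total

# Hypothetical numbers: 5,000-token prefix, 500 new tokens per turn, 20 turns.
total = cumulative_input_tokens(fixed_prefix=5000, tokens_per_turn=500, turns=20)
# 20 * 5000 + 500 * (0 + 1 + ... + 19) = 100000 + 95000 = 195000
```

Under these assumptions the prefix alone accounts for just over half of the input tokens, and the history term grows quadratically with the number of turns.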
Tencent Open-Sources TencentDB Agent Memory: A 4-Tier Local Memory Pipeline for AI Agents Tencent has open-sourced TencentDB Agent Memory, a fully local memory system for AI agents released under t... #Agentic #AI #AI #Infrastructure #Applications #Artificial #Intelligence #Edito…
How do you make an AI agent actually remember?
How three AI agents can collaborate on a complex task by sharing a folder of markdown files, and nothing else.
dev.to — LLM tag
TIER_1English(EN)·Vaishnavi Gudur·
If you're building AI agents with Flowise, Dify, n8n, or similar no-code/low-code platforms, there's a security threat you probably haven't thought about: memory poisoning. And it's not theoretical. It's in the https://owasp.org/www-project-top…
Ask a stateless AI agent about something you told it last week and it remembers nothing. That's the core problem memory tools solve. In 2026, long-term memory for AI agents has become one of the hottest areas in the ecosystem, with dedicated tools like …
dev.to — LLM tag
TIER_1English(EN)·Vaishnavi Gudur·
Securing LangGraph Multi-Agent Workflows Against Memory Poisoning (ASI06). LangGraph has become the de facto standard for building complex, multi-agent workflows. Its core abstraction, the state graph, allows developers to build cyclic, stateful applications where agen…
MemSkill reframes LLM-agent memory operations as a learnable skill bank: an RL controller selects Top-K skills per span, an LLM designer periodically rewrites them from hard cases. But "self-evolving" overstates the test-time story — both controller and bank are trained offline a…
Your AI Agent's Memory is a Security Hole: Here's the Fix. I've been working on AI agent security for the past few months as part of the OWASP Top 10 for … (https://owasp.org/www-project-top-10-for-large-language-model-applications/)
The Bug That Forced Us to Add Agent Memory. Project: Nexus Core AI OS. Stack: Hindsight (persistent memory) · cascadeflow (runtime intelligence & routing). 1. Introduction. I didn't plan to build a memory sys…
Android and AI: are 128 GB of storage becoming insufficient? As AI features advance on Android, smartphone storage risks becoming an increasingly critical bottleneck. At the center of the problem is AICore, …
🤖 Which project/framework has actually nailed persistent memory for AI agents? Not talking about the LLM itself but about the memory layer on top. There are quite a few out there now, open source ones and proprietary frameworks. Curious what people have actually tried and stu... …
Hermes Memory Installer Review: One-Command Persistent Memory for Local AI Agents. Nous Research's Hermes Memory Installer adds local persistent memory to AI agents with one shell command. We compare its file-based approach to Mem0 and Letta. https://pickuma.com/posts/hermes-memo…
From Stateless Prompts to Persistent Intelligence. Where this fits: This article bridges two series. It closes out the themes introduced in The Backyard Quarry (a data engineering exploration using physical objects as a teaching domain) and…
🧠 Graft provides a semantic memory system for AI agents that operates independently of large language models. The tool allows agents to store and retrieve information based on meaning rather than exact text matching. 💬 Hacker News 🔗 https://github.com/AEndrix03/Graft #AI #Mach…
In the world of Large Language Models (LLMs), we often face a frustrating paradox: LLMs are incredibly capable at "reasoning" in the moment, but they are fundamentally stateless. Every time you start a new session, the agent has total amnesia. It doesn't remem…
Originally published on Poniak Times (https://www.poniaktimes.com/subq-model-efficient-long-context-ai/). Reposted here for the developer and AI engineering community. Subquadratic’s SubQ model claims to make long-context A…
dev.to — LLM tag
TIER_1English(EN)·Jonathanfarrow·
If you are building agents in 2026, you have already hit the wall. Bigger models do not fix forgetfulness. Context windows can grow forever, and the agent still cannot remember what a user told it last Tuesday, that the customer's address changed three months ago, or that a re…
This article was originally published on AI Study Room (https://dingjiu1989-hue.github.io/en/ai/ai-agents-memory-patterns.html). For the full version with working code examples and related articles, visit the original post…
English version: Building Kernel Memory Protocol: Navigable Memory for AI Agents (https://dev.to/tirsogarcia/building-kernel-memory-protocol-navigable-memory-for-ai-agents-315j). The problem with many AI agents is not that they …
Spanish version: Construyendo Kernel Memory Protocol: memoria navegable para agentes de IA (https://dev.to/tirsogarcia/construyendo-kernel-memory-protocol-memoria-navegable-para-agentes-de-ia-24lc). The hard part with many AI age…
How Agentic Search Actually Works: The Research Loop Link-Fetching Agents Miss. Most agent tutorials show you the same pattern: take a user query, call a search API, grab the top result, stuff the text into your prompt. Done. Ship it. That works fine for trivi…
I’ve been building my own persistent memory layer for coding agents, and along the way I realized something surprising: Most memory systems out there are basically just session-based retrieval. They don’t forget, they don’t manage life…