Brief

last 24h

[2/2] 221 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.
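
As a rough illustration of how a 0–100 score could combine these four signals, here is a minimal sketch. The feed does not publish its formula, so the weights, the 24-hour half-life, and the function name `score_story` are all assumptions made for this example.

```python
# Hypothetical weights for the three non-temporal signals (sum to 1.0).
# These values are illustrative assumptions, not the feed's real settings.
WEIGHTS = {"authority": 0.4, "cluster": 0.3, "headline": 0.3}


def score_story(authority, cluster, headline, age_hours, half_life_hours=24.0):
    """Combine three signals in [0, 1] and apply exponential time decay.

    Returns a score between 0 and 100, rounded to one decimal place.
    """
    signals = {"authority": authority, "cluster": cluster, "headline": headline}
    for name, value in signals.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    if age_hours < 0:
        raise ValueError("age_hours must be non-negative")

    # Weighted sum of the three signals, still in [0, 1].
    base = sum(WEIGHTS[name] * value for name, value in signals.items())
    # Assumed decay model: the score halves every `half_life_hours`.
    decay = 0.5 ** (age_hours / half_life_hours)
    return round(100.0 * base * decay, 1)


fresh = score_story(1.0, 1.0, 1.0, age_hours=0)
day_old = score_story(1.0, 1.0, 1.0, age_hours=24)
```

Under these assumptions a story with perfect signals loses half its score after one half-life, so older items drop down the list unless their other signals are strong.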

RESEARCH · arXiv cs.AI · English (EN) · 4d · [2 sources]

DeferMem: Query-Time Evidence Distillation via Reinforcement Learning for Long-Term Memory QA

DeferMem is a new framework for question answering by large language model agents over long-term conversational memory. It splits the task into two stages: broad candidate retrieval, followed by query-conditioned evidence distillation. A reinforcement learning algorithm called DistillPO refines the retrieved information into concise, relevant evidence, and the system reportedly outperforms existing methods in both accuracy and efficiency.

IMPACT Improves LLM agent performance in complex, long-context question answering tasks.
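
The two-stage shape described in this item (retrieve broadly, then distill against the query) can be sketched as follows. This is a toy illustration only: token overlap stands in for the retriever, and a fixed overlap threshold stands in for the distillation step, which the summary says DeferMem learns with DistillPO. The function names, the sample memory, and the threshold are all hypothetical.

```python
import re


def tokens(text):
    """Lowercased word set; a crude stand-in for a learned representation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, memory, k=3):
    """Stage 1: broad candidate retrieval, ranked by token overlap with the query."""
    q = tokens(query)
    ranked = sorted(memory, key=lambda m: len(q & tokens(m)), reverse=True)
    return ranked[:k]


def distill(query, candidates, min_overlap=2):
    """Stage 2: query-conditioned distillation.

    Keeps only sentences sharing at least `min_overlap` terms with the query.
    Hypothetical heuristic; it is NOT the DistillPO-trained policy.
    """
    q = tokens(query)
    evidence = []
    for cand in candidates:
        # Split each memory entry into sentences on terminal punctuation.
        for sentence in re.split(r"(?<=[.!?])\s+", cand):
            if len(q & tokens(sentence)) >= min_overlap:
                evidence.append(sentence.strip())
    return evidence


# Made-up conversational memory for the example.
memory = [
    "Alice said she moved to Berlin in March. She also bought a bike.",
    "Bob prefers tea. He mentioned a trip to Lisbon.",
    "Alice adopted a cat named Miso. The weather was cold.",
]
query = "Where did Alice move in March?"
candidates = retrieve(query, memory, k=2)
evidence = distill(query, candidates)
```

The first stage favors recall and returns whole memory entries; the second stage trims them to the sentences that matter for this particular query, which is the part a learned distiller would replace.
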
RESEARCH · arXiv cs.CL · English (EN) · 4d · [2 sources]

What Training Data Teaches RL Memory Agents: An Empirical Study of Curriculum Effects in Memory-Augmented QA

A new study on arXiv examines how training data curricula affect reinforcement learning (RL) agents that pair large language models (LLMs) with external memory banks. It finds that the composition of the training data shapes what an agent specializes in rather than boosting performance uniformly. A mixed curriculum combining several benchmarks gave the best overall results, while training on a narrow out-of-domain set specifically improved temporal reasoning.

IMPACT Demonstrates that curriculum design is a key factor in tailoring AI agent capabilities for specific tasks.
