Customer-Agent: Overcoming Context Limitations in Ultra-Long Shopping Trajectories via Tool-Augmented Agents and RLVR

AI agents are trained to navigate long shopping histories

By the PulseAugur editorial team · [2 sources] · 2026-06-06 06:22

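Researchers have developed new methods for training AI agents to understand long customer shopping trajectories, a task previously constrained by the context-window limits of large language models. One approach uses "agent arenas" on the Bittensor network to generate diverse, judged training data for shopping agents, significantly improving their benchmark performance. The other introduces a framework that lets agents autonomously retrieve and parse long trajectories from external files through tool-augmented interaction, effectively bypassing LLM context limits and showing strong performance on a new long-context benchmark.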
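As a rough illustration of the second idea, the sketch below shows how an agent might query a trajectory stored in an external file through small tools instead of loading the whole history into its prompt. This is not the papers' implementation: the JSONL layout, the `action` and `item` field names, and the functions `search_trajectory`, `read_window`, and `build_demo_file` are all hypothetical, chosen only to make the pattern concrete.

```python
import json
import os
import tempfile


def search_trajectory(path, keyword, max_hits=5):
    """Hypothetical tool: scan a JSONL trajectory file line by line and
    return up to `max_hits` events whose item text contains `keyword`."""
    hits = []
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f):
            event = json.loads(line)
            if keyword.lower() in event["item"].lower():
                hits.append({"line": line_no, **event})
                if len(hits) >= max_hits:
                    break
    return hits


def read_window(path, start, count):
    """Hypothetical tool: return `count` consecutive events beginning at
    line `start`, so the agent only sees a small slice of the history."""
    window = []
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f):
            if line_no >= start + count:
                break
            if line_no >= start:
                window.append(json.loads(line))
    return window


def build_demo_file(n_events=10_000):
    """Write a synthetic trajectory (assumed format) to a temp file."""
    fd, path = tempfile.mkstemp(suffix=".jsonl")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        for i in range(n_events):
            if i == 1234:
                event = {"action": "purchase", "item": "trail running shoes"}
            else:
                event = {"action": "click", "item": f"item-{i}"}
            f.write(json.dumps(event) + "\n")
    return path
```

In an agent loop, the model would first call `search_trajectory` with a keyword, then call `read_window` around a returned line number, so only a handful of events ever enter the context window regardless of how long the file is.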

Impact: New training techniques and benchmarks could help AI agents better understand and respond to complex, long-term user behavior.

Ranking rationale: Two academic papers detail novel methods for training AI agents on long-sequence data.


AI-generated summary · Google Gemini · based on 2 sources.

Sources [2]

arXiv cs.AI TIER_1 English(EN) · Shardul Bansal, Seth Schilbe, Jarrod Barnes · 2026-06-10 04:00

Bittensor Agent Arenas as a Trajectory Primitive: Distilling a Shopping Agent from ShoppingBench Subnet Traces

arXiv:2606.10064v1 Announce Type: cross Abstract: Small-model agentic post-training is bottlenecked less by the algorithm than by the trajectory substrate it consumes. Leading recipes (RLVR, group-relative RL, rejection-sampled re-SFT) all need multi-turn traces carrying per-traj…

arXiv cs.CL TIER_1 English(EN) · Bing Yin · 2026-06-06 06:22

Customer-Agent: Overcoming Context Limitations in Ultra-Long Shopping Trajectories via Tool-Augmented Agents and RLVR

Understanding customer shopping trajectories is essential for enabling personalized shopping experiences. However, shopping records (i.e., customer's search, clicks, purchases, etc.) often span long time horizons over multiple years, resulting in extremely long trajectories that …
