Researchers have developed a framework called Customer-Agent for handling extremely long customer shopping trajectories, which often exceed the context windows of current large language models. The framework uses a Reinforcement Learning with Verifiable Rewards (RLVR) approach, enabling agents to autonomously retrieve and parse trajectory data through code interpreter interactions. The authors also introduce a benchmark, ShopTrajQA, to evaluate model performance on these long-context datasets, with variants of up to 64k tokens.
AI IMPACT This research could enable more personalized e-commerce experiences by allowing LLMs to process extensive customer histories.
RANK_REASON The cluster contains an academic paper introducing a new framework and benchmark for LLMs.
AI-generated summary · Google Gemini · from 1 source.