This tutorial introduces AgentTrove, a large open-source dataset of agentic interaction traces, accessible via streaming to avoid full downloads. It details methods for inspecting conversation schemas, normalizing turns, and parsing agent outputs, including shell commands. The process also covers creating a clean ShareGPT-style dataset for supervised fine-tuning by summarizing statistics and visualizing patterns from thousands of traces. AI
IMPACT Enables researchers to efficiently analyze and fine-tune agent models using a large, accessible dataset.
RANK_REASON The cluster describes a tutorial on using an open-source dataset and associated tools for analysis and fine-tuning, which falls under research and tooling. [lever_c_demoted from research: ic=1 ai=1.0]
AI-generated summary · Google Gemini · from 1 sources. How we write summaries →