Measuring Agents in Production
A new study, Measuring Agents in Production (MAP), has analyzed the current state of LLM-based agents deployed across various industries. The research, based on 20 case studies and a survey of 86 practitioners, reveals that most production agents operate with significant human oversight and rely on off-the-shelf models rather than fine-tuning. Reliability is identified as the primary challenge, with developers currently addressing it through system-level design rather than model improvements.
AI IMPACT: Highlights current limitations and research gaps in production AI agent deployment, suggesting a focus on reliability and system-level design.