IBM and UC Berkeley identify causes of enterprise AI agent failures

By PulseAugur Editorial · Summary by gemini-2.5-flash-lite from 1 source

IBM Research and UC Berkeley have developed IT-Bench, a new benchmark for evaluating enterprise AI agents. They also introduced MAST, a framework for diagnosing the root causes of agent failures. The work aims to make AI agents more reliable and effective in business environments by identifying the specific areas where they struggle.

COVERAGE [1]

Hugging Face Blog TIER_1 · 2026-02-18 16:15

IBM and UC Berkeley Diagnose Why Enterprise Agents Fail Using IT-Bench and MAST
