SynAE: A Framework for Measuring the Quality of Synthetic Data for Tool-Calling Agent Evaluations
Researchers have developed SynAE, a framework for evaluating the quality of synthetic data used to test tool-calling AI agents. It targets cases where synthetic data must stand in for real-world datasets that are insufficient or contain sensitive information. SynAE measures synthetic data across four categories (task instructions and responses, tool calls, final outputs, and downstream evaluation), assessing validity, fidelity, and diversity.
AI IMPACT: Provides a standardized method for assessing the reliability of synthetic datasets used in AI agent development and evaluation.