Query-efficient model evaluation using cached responses
Researchers have developed a new method to make evaluating AI models more efficient by leveraging cached responses from previously tested models. This approach, based on the Data Kernel Perspective Space (DKPS), can predict benchmark performance with fewer queries than traditional methods. The DKPS method is theoretically shown to be query-efficient under specific conditions and empirically demonstrated to achieve similar accuracy with a reduced query budget. Additionally, an offline technique is proposed for selecting queries that optimize prediction accuracy on reference models.
AI IMPACT: Reduces the computational cost of benchmarking new AI models, potentially accelerating research and development cycles.