OpenAI has introduced GDPval, a new evaluation that measures AI model performance on economically valuable, real-world tasks across 44 occupations. The tasks are drawn from key industries contributing to U.S. GDP, and each is based on an actual work product, such as a legal brief or an engineering blueprint. GDPval aims to give a more realistic assessment of AI capabilities than traditional academic benchmarks by focusing on how models can support professionals in their daily work.
Summary written by gemini-2.5-flash-lite from 1 source.