Let’s talk about evals.
OpenAI has released a new episode of its podcast featuring Tejal Patwardhan, who leads the frontier evaluations team. The episode discusses why model evaluations matter and how to measure progress, especially as benchmarks become saturated or manipulated. Patwardhan also explains why she initially underestimated AI models and how her perspective has evolved.
AI IMPACT: Discusses methods for evaluating AI models, offering insights into the challenges and importance of accurate measurement in AI development.