Ethan Mollick argues that current AI benchmarks are flawed for two reasons: many are publicly available, so models end up being trained on them, and they do not always measure what they claim to measure. He suggests that although benchmarks show an overall upward trend in AI capabilities, they lack the nuance to assess specific skills such as writing or empathy. Mollick proposes that individuals and organizations should instead AI
Summary written by gemini-2.5-flash-lite from 1 source.