The Hugging Face Open LLM Leaderboard has updated its evaluation methodology to include the MMLU benchmark, a comprehensive test of language model knowledge across 57 subjects. This change aims to provide a more robust assessment of model capabilities by incorporating a wider range of academic and professional domains. The leaderboard now ranks open-source large language models by a weighted average that combines MMLU scores with its existing benchmarks.
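The summary does not specify which benchmarks are combined or what weights are used, so the following is only a minimal sketch of how such a weighted-average ranking might be computed; the benchmark names and weights below are illustrative assumptions, not the leaderboard's actual configuration.

```python
# Sketch of a weighted-average leaderboard score.
# NOTE: the benchmark set and weights are hypothetical assumptions;
# the source does not state the leaderboard's actual weighting.

BENCHMARK_WEIGHTS = {
    "mmlu": 0.4,        # assumed weight for the newly added MMLU component
    "arc": 0.2,         # assumed weights for pre-existing benchmarks
    "hellaswag": 0.2,
    "truthfulqa": 0.2,
}

def leaderboard_score(scores: dict[str, float]) -> float:
    """Combine per-benchmark accuracies (0-100) into a single ranking score."""
    total_weight = sum(BENCHMARK_WEIGHTS[name] for name in scores)
    weighted_sum = sum(scores[name] * BENCHMARK_WEIGHTS[name] for name in scores)
    return weighted_sum / total_weight

# Example: rank two hypothetical models by their combined score.
models = {
    "model-a": {"mmlu": 68.2, "arc": 61.5, "hellaswag": 83.0, "truthfulqa": 47.1},
    "model-b": {"mmlu": 71.9, "arc": 58.3, "hellaswag": 80.4, "truthfulqa": 50.6},
}
ranking = sorted(models, key=lambda m: leaderboard_score(models[m]), reverse=True)
print(ranking)
```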