Hugging Face has updated its Open LLM Leaderboard to include a new evaluation benchmark, DROP (Discrete Reasoning Over Paragraphs). The addition is meant to better assess the reasoning capabilities of large language models, particularly on tasks that require multi-step reasoning over complex passages of text. DROP scores now factor into the ranking of open-source models, giving a more nuanced view of their performance than traditional benchmarks alone.
Summary written by gemini-2.5-flash-lite from 1 source.