Hugging Face has updated its Open LLM Leaderboard to include a new evaluation benchmark, DROP (Discrete Reasoning Over Paragraphs). The addition is intended to better assess the reasoning capabilities of large language models, particularly on questions that require reading a passage and performing discrete operations such as counting, addition, or sorting over its contents. DROP scores now factor into the ranking of open-source models, giving a more nuanced view of their performance than the leaderboard's earlier benchmarks alone.
AI-generated summary · Google Gemini · from 1 source.