Entity
Elo
PulseAugur coverage of Elo — every cluster mentioning Elo across labs, papers, and developer communities, ranked by signal.
Total · 30 days
2
2 in 90 days
Releases · 30 days
0
0 in 90 days
Papers · 30 days
2
2 in 90 days
Tier distribution · 90 days
Recent · Page 1 of 1 · 2 items
- Study finds global LLM leaderboards misleading, proposes portfolio rankings
A new research paper argues that current leaderboards for large language models (LLMs) are misleading due to significant heterogeneity in user preferences across languages and tasks. The study analyzed approximately 89,…
- Chess-GPT model learns world model, can be manipulated to change skill
Researchers have explored interventions on a language model trained to play chess, dubbed Chess-GPT. By manipulating the model's internal representations of the board state and player skill, they demonstrated a causal l…