New KINA benchmark ranks Gemini 3.1 Pro highest, surpassing Claude and GPT-5

By PulseAugur Editorial · [2 sources] · 2026-06-03 17:06

A new benchmark called KINA (Knowledge Index of Noah's Ark) evaluates large language models across 261 fine-grained disciplines. Its authors target three problems in existing knowledge benchmarks: scaling-driven designs that do not operationalize disciplinary representativeness, flat-payment annotation that permits lazy consensus, and unaudited ranking instability under bounded test budgets. The 899-item benchmark was used to evaluate 42 models from 13 labs. Gemini-3.1-Pro-Preview scored highest at 53.17%, followed by Claude-Opus-4.6 and GPT-5.4. A top score only slightly above half suggests substantial room for improvement across models.
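To give a feel for why ranking stability matters on an 899-item test, here is a minimal sketch of the sampling uncertainty around a single score. It assumes the reported score is plain accuracy over independent items, which the summary does not confirm (478 correct out of 899 does round to 53.17%, which is consistent with that reading but does not prove it). The function name `accuracy_interval` and the normal-approximation interval are illustrative choices, not the paper's method.

```python
import math


def accuracy_interval(correct: int, total: int, z: float = 1.96) -> tuple[float, float, float]:
    """Return (accuracy, lower, upper) using the normal approximation.

    Assumption: each item is scored right/wrong independently, so the
    number of correct answers is treated as binomial.
    """
    p = correct / total
    standard_error = math.sqrt(p * (1 - p) / total)
    return p, p - z * standard_error, p + z * standard_error


# Hypothetical reading of the top score: 478 of 899 items correct.
acc, low, high = accuracy_interval(478, 899)
print(f"accuracy={acc:.4f}  approx. 95% interval=({low:.4f}, {high:.4f})")
```

Under these assumptions the interval spans roughly three percentage points either side of the score, so two models separated by less than that could plausibly swap places on a different sample of items.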

IMPACT Introduces a new evaluation benchmark for LLMs, highlighting performance tiers and the impact of tool augmentation.

RANK_REASON The cluster contains a research paper introducing a new benchmark for LLMs and reporting evaluation results.

AI-generated summary · Google Gemini · from 2 sources.

COVERAGE [2]

arXiv cs.AI TIER_1 English(EN) · Sheng Jin, Minghao Liu, Yunze Xiao, Zeqi Zhou, Heli Qi, Yifan Yao, Meishu Song, Kaijing Ma, Xuan Zhang, Sicong Jiang, Yizhe Li, Ningshan Ma, Jie Wei, Ziniu Li, Minglai Yang, Bangya Liu, Yiming Liang, Xiao Fang, Qingcheng Zeng, Jiarui Liu, Rui Yang, Shen … · 2026-06-04 04:00

Knowledge Index of Noah's Ark

arXiv:2606.05104v1 · Abstract: Knowledge benchmarks for LLMs face three issues: scaling-driven designs that do not operationalize disciplinary representativeness; flat-payment annotation that permits lazy consensus; and unaudited ranking instability under bounded…

arXiv cs.AI TIER_1 English(EN) · Ge Zhang · 2026-06-03 17:06

Knowledge Index of Noah's Ark

Knowledge benchmarks for LLMs face three issues: scaling-driven designs that do not operationalize disciplinary representativeness; flat-payment annotation that permits lazy consensus; and unaudited ranking instability under bounded test budgets. We introduce KINA, an 899-item be…
