APEX-Agents leaderboard for open-source models launches on Hugging Face

By the PulseAugur editorial team · [1 source] · 2026-04-30 18:35

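Mercor has launched an APEX-Agents leaderboard on Hugging Face for evaluating open-source models. The benchmark measures whether models can carry out tasks normally handled by professionals such as consultants, lawyers, and bankers. The leaderboard is intended to track progress and performance on these complex, real-world tasks.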

Impact: Provides a new benchmark for evaluating the agentic capabilities of open-source models in professional domains.

Ranking rationale: Launch of a new benchmark dataset and leaderboard for evaluating open-source models.

AI-generated summary · Google Gemini · based on 1 source.

Sources [1]

X (Hugging Face) · TIER_1 · English (EN) · Hugging Face · 2026-04-30 18:35

RT Mercor APEX-Agents now has a @huggingface leaderboard for open-source models. APEX-Agents is our frontier benchmark for whether models can do the real work of consultants, lawyers, and bankers. https://huggingface.co/datasets/mercor/apex-agents …
