ERGeoBench: A Comprehensive Benchmark for Embodied Reasoning and Geo-localization in Multimodal Large Language Models

New ERGeoBench benchmark tests the embodied geo-localization ability of multimodal LLMs

By the PulseAugur editorial team · [2 sources] · 2026-05-29 12:49

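Researchers have introduced ERGeoBench, a new benchmark for evaluating the geo-localization ability of multimodal large language models (MLLMs) acting as embodied agents. The benchmark uses more than 2,200 street-view panoramas to evaluate models in single-view, panoramic-view, and embodied-view settings. The evaluation shows that although current MLLMs grasp high-level geographic concepts, they still struggle with precise metric localization and with maintaining spatial consistency across views, which highlights the need to integrate perception and reasoning.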
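To make "precise metric localization" concrete, here is a minimal sketch of how a geo-localization error is commonly scored: the great-circle (haversine) distance between a predicted and a true coordinate, summarized as the fraction of predictions that fall within a set of distance thresholds. This is an illustration only and not ERGeoBench's own scoring code. The function names and the threshold values (1, 25, 200, 750, 2500 km) are assumptions borrowed from common practice in image geo-localization, and the summary above does not state which metric the benchmark uses.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def accuracy_at_thresholds(predictions, truths, thresholds_km=(1, 25, 200, 750, 2500)):
    """Fraction of predictions whose error is within each threshold (hypothetical helper)."""
    errors = [haversine_km(*p, *t) for p, t in zip(predictions, truths)]
    return {thr: sum(e <= thr for e in errors) / len(errors) for thr in thresholds_km}


if __name__ == "__main__":
    # Hypothetical predictions and ground truth as (latitude, longitude) pairs.
    preds = [(48.8584, 2.2945), (40.0, -74.0)]
    truth = [(48.8606, 2.3376), (34.0522, -118.2437)]
    print(accuracy_at_thresholds(preds, truth))
```

A model can name the right country or city and still score poorly at the tightest thresholds, which is the gap between high-level geographic understanding and metric precision that the summary describes.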

Impact: Provides a standardized evaluation for embodied AI agents, advancing work on spatial reasoning and geo-localization.

Ranking rationale: This cluster contains a research paper introducing a new benchmark for evaluating AI models.


AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

arXiv cs.AI TIER_1 English(EN) · Kaiwen Xue, Tao Wei, Guoxin Zhang, Zhonghong Ou, Kaoyan Lu, Yu Feng, Yifan Zhu, Haoran Luo · 2026-06-01 04:00

ERGeoBench: A Comprehensive Benchmark for Embodied Reasoning and Geo-localization in Multimodal Large Language Models

arXiv:2605.31251v1 Announce Type: cross Abstract: Multimodal large language models (MLLMs) have shown strong potential as embodied agents, yet embodied geo-localization remains underexplored due to the lack of fine-grained evaluation. We introduce ERGeoBench, a diagnostic benchma…
arXiv cs.AI TIER_1 English(EN) · Haoran Luo · 2026-05-29 12:49

ERGeoBench: A Comprehensive Benchmark for Embodied Reasoning and Geo-localization in Multimodal Large Language Models

Multimodal large language models (MLLMs) have shown strong potential as embodied agents, yet embodied geo-localization remains underexplored due to the lack of fine-grained evaluation. We introduce ERGeoBench, a diagnostic benchmark for vision-driven embodied geo-localization. ER…

