PulseAugur
English(EN) AirGroundBench: Probing Spatial Intelligence in Multimodal Large Models under Heterogeneous Multi-View Embodied Collaboration

New AirGroundBench benchmark probes the spatial intelligence of multimodal large models (MLLMs)

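Researchers have introduced AirGroundBench, a new benchmark for evaluating the spatial intelligence of multimodal large language models (MLLMs) in collaborative air-ground scenarios. The benchmark addresses limitations of existing evaluations by focusing on the heterogeneous views, scale mismatches, and reference-frame inconsistencies inherent in joint operation of unmanned aerial vehicles (UAVs) and unmanned ground vehicles (UGVs). Evaluations on AirGroundBench show that current MLLMs perform reasonably well at basic spatial perception but struggle significantly with cross-view alignment and dense transformation reasoning, which impairs their sequential decision-making in vision-language navigation tasks.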

Impact: Highlights key limitations of current MLLMs in spatial reasoning, guiding future research toward more robust embodied AI.

Ranking rationale: This cluster describes a new research benchmark for evaluating AI models.

AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

  1. arXiv cs.CV TIER_1 English(EN) · Haotian Li, Yida Wang, Leyuan Wang, Jinshan Lai, Keyang Wang, Zonghao Guo, Qiang Ma, Liuyu Xiang, Jianwei Hu, Zhaofeng He ·

    AirGroundBench: Probing Spatial Intelligence in Multimodal Large Models under Heterogeneous Multi-View Embodied Collaboration

    arXiv:2606.28049v1 Announce Type: new Abstract: In recent years, multimodal large language models (MLLMs) have shown strong potential for embodied intelligence, yet their ability to maintain geometrically consistent spatial understanding across heterogeneous views remains under-e…

  2. arXiv cs.CV TIER_1 English(EN) · Zhaofeng He ·

    AirGroundBench: Probing Spatial Intelligence in Multimodal Large Models under Heterogeneous Multi-View Embodied Collaboration

    In recent years, multimodal large language models (MLLMs) have shown strong potential for embodied intelligence, yet their ability to maintain geometrically consistent spatial understanding across heterogeneous views remains under-evaluated. Existing benchmarks largely focus on s…