Can Vision Foundation Models Navigate? Zero-Shot Real-World Evaluation and Lessons Learned

Visual navigation models collide frequently and show poor robustness in real-world tests

By the PulseAugur editorial team · [1 source] · 2026-06-17 04:00

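A new research paper evaluates five state-of-the-art Visual Navigation Models (VNMs) in real-world settings and reveals significant limitations that a simple success rate does not capture. The study, by Maeva Guerrier and colleagues, found that GNM, ViNT, NoMaD, NaviBridger, and CrossFormer frequently collide with objects, which suggests they lack geometric understanding. The models also struggle to distinguish perceptually similar locations, and their performance degrades under environmental changes such as motion blur or sunlight. The researchers plan to release their evaluation codebase and dataset to support reproducible benchmarking.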

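Impact: Reveals key limitations of current visual navigation models and underscores the need for better geometric understanding and robustness in real-world robotics applications.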

Ranking rationale: A paper that evaluates existing models using new metrics.


AI-generated summary · Google Gemini · from 1 source.

Sources [1]

arXiv cs.LG · Maeva Guerrier, Karthik Soma, Jana Pavlasek, Giovanni Beltrame · 2026-06-17 04:00

Can Vision Foundation Models Navigate? Zero-Shot Real-World Evaluation and Lessons Learned

arXiv:2603.25937v2. Abstract: Visual Navigation Models (VNMs) promise generalizable, robot navigation by learning from large-scale visual demonstrations. Despite growing real-world deployment, existing evaluations rely almost exclusively on success rat…
