Where to Look: Can Foundation Models Reach a Target Viewpoint Through Active Exploration?

New benchmark tests foundation models' active 3D navigation ability

By PulseAugur Editorial · [2 sources] · 2026-05-31 00:00

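Researchers have introduced TVRBench, a new benchmark that tests whether foundation models can actively navigate a 3D environment to match the viewpoint of a target image. Current models struggle considerably on this task, especially when it requires translating the body or handling a multi-turn visual history. A unified post-training framework, in particular vision-action supervised fine-tuning, yields a marked improvement, raising the success rate of a 9B model above 50%. The benchmark aims to advance the development of models that can both perceive and act in 3D space.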

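Impact: Establishes a new benchmark for evaluating and training embodied spatial intelligence in foundation models, highlighting current limitations and potential training approaches.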

Ranking rationale: The cluster contains a research paper that introduces a new benchmark for evaluating foundation models. [lever_c_demoted from research: ic=1 ai=1.0]

AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

Hugging Face Daily Papers TIER_1 English(EN) · 2026-05-31 00:00

Where to Look: Can Foundation Models Reach a Target Viewpoint Through Active Exploration?

Target Viewpoint Reproduction task challenges foundation models to actively adjust 3D viewpoints to match target images, revealing limitations in visual history processing and embodied movement mapping, with a unified post-training framework improving success rates through variou…
arXiv cs.CV TIER_1 English(EN) · Liyang Li, Muzhi Zhu, Zhiyue Zhao, Hengyu Zhao, Ke Liu, Linhao Zhong, Hao Chen, Chunhua Shen · 2026-06-02 04:00

Where to Look: Can Foundation Models Reach a Target Viewpoint Through Active Exploration?

arXiv:2606.01247v1 Announce Type: new Abstract: Humans can reproduce the viewpoint specified by a target image through active head and body motion, yet spatial intelligence in foundation models has largely been studied as passive understanding of pre-collected observations. We in…
