VistaRef: Boosting Visual Spatial Orientation Awareness for Pointing-to-Object Detection

New VistaRef framework improves spatial orientation awareness in object detection · 2 sources tracked

By PulseAugur Editorial · [2 sources] · 2026-06-23 12:30

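Researchers have introduced VistaRef, a new framework aimed at improving spatial orientation awareness in pointing-to-object detection. The system addresses a limitation of existing Transformer-based models, which often overlook fine-grained geometric relationships and therefore localize pointing targets inaccurately. VistaRef includes a local hand entity modeling module that better captures finger deviations, and a geometric ray modeling module that converts direction information into explicit spatial features. An orientation consistency alignment loss further optimizes hand presence and pointing consistency, yielding an absolute gain of 14 percentage points in grounding accuracy over the baseline model.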
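The summary does not say how the geometric ray module or the consistency loss is implemented. As a rough illustration of the general idea only, the sketch below turns two hand keypoints into an explicit 2D ray and scores candidate boxes by how far their centers deviate from it. Everything here is an assumption made for illustration: the keypoint choice (wrist and fingertip), the `1 - cosine` penalty, and all function names are hypothetical and are not taken from the VistaRef paper.

```python
import math

def pointing_ray(wrist, fingertip):
    """Return (origin, unit_direction) of a 2D ray from wrist through fingertip.

    Assumption: the ray starts at the fingertip and follows the wrist-to-fingertip direction.
    """
    dx, dy = fingertip[0] - wrist[0], fingertip[1] - wrist[1]
    norm = math.hypot(dx, dy)
    if norm == 0:
        raise ValueError("wrist and fingertip coincide; direction is undefined")
    return fingertip, (dx / norm, dy / norm)

def direction_cosine(origin, direction, point):
    """Cosine of the angle between the ray and the vector from origin to point."""
    vx, vy = point[0] - origin[0], point[1] - origin[1]
    norm = math.hypot(vx, vy)
    if norm == 0:
        return 1.0  # the point sits exactly on the ray origin
    return (vx * direction[0] + vy * direction[1]) / norm

def box_center(box):
    """Center of an (x1, y1, x2, y2) box."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)

def rank_boxes(wrist, fingertip, boxes):
    """Sort boxes by a hypothetical consistency penalty, 1 - cos(angle to ray).

    The penalty is 0 for a center lying on the ray and 2 for one directly behind it.
    """
    origin, direction = pointing_ray(wrist, fingertip)
    scored = [
        (1.0 - direction_cosine(origin, direction, box_center(b)), b)
        for b in boxes
    ]
    return sorted(scored, key=lambda item: item[0])

if __name__ == "__main__":
    candidates = [(0, 4, 2, 6), (4, -1, 6, 1), (-5, -1, -3, 1)]
    for penalty, box in rank_boxes((0, 0), (1, 0), candidates):
        print(f"{box}: penalty={penalty:.3f}")
```

A learned model would presumably feed such ray features into the detector rather than rank boxes directly; the ranking here only makes the geometry visible.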

Impact: By improving how models understand pointing gestures, the work raises the precision of spatial interaction in AR and robotics.

Ranking rationale: This cluster contains a research paper detailing a new framework and methodology for a specific computer vision task.

Read on arXiv cs.CV →

AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

arXiv cs.CV TIER_1 English(EN) · Ling Li, Zhizhen Cai, Xinkun Wu, Ziyu Zhu, Jiaqing Lyu, Bowen Liu, Zhidong Deng · 2026-06-24 04:00

VistaRef: Boosting Visual Spatial Orientation Awareness for Pointing-to-Object Detection

arXiv:2606.24498v1 Announce Type: new Abstract: Grounding deictic gestures in natural images is fundamental to AR and human-robot collaboration, providing a basis for seamless spatial interaction. While Transformer-based visual models have achieved significant progress in general…

arXiv cs.CV TIER_1 English(EN) · Zhidong Deng · 2026-06-23 12:30

VistaRef: Boosting Visual Spatial Orientation Awareness for Pointing-to-Object Detection

Grounding deictic gestures in natural images is fundamental to AR and human-robot collaboration, providing a basis for seamless spatial interaction. While Transformer-based visual models have achieved significant progress in general object detection, their global attention mechan…
