VISTA: Technical Report for the Ego4D Short-Term Object Interaction Anticipation at EgoVis 2026

VISTA system wins Ego4D challenge with its object interaction anticipation

By the PulseAugur editorial team · [2 sources] · 2026-05-20 08:42

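Researchers have developed VISTA, a new system for anticipating human-object interactions in egocentric (first-person) video. VISTA combines spatial object detection with temporal context from a frozen V-JEPA 2.1 model to predict upcoming interactions. The method took first place in the Ego4D Short-Term Object Interaction Anticipation challenge at EgoVis 2026.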

Impact: Sets a new bar for egocentric video analysis and human-object interaction anticipation.

Ranking rationale: This cluster contains a technical report detailing a new system that won a specific challenge.

AI-generated summary · Google Gemini · based on 2 sources.

Sources [2]

arXiv cs.AI · Qiaohui Chu, Haoyu Zhang, Yisen Feng, Meng Liu, Weili Guan, Dongmei Jiang, Liqiang Nie · 2026-05-22 04:00

VISTA: Technical Report for the Ego4D Short-Term Object Interaction Anticipation at EgoVis 2026

arXiv:2605.20901v1 (cross-listed). Abstract: We propose VISTA, a V-JEPA Integrated StillFast Temporal Anticipator for the Ego4D Short-Term Object Interaction Anticipation (STA) Challenge at EgoVis 2026. Given an egocentric video timestamp, the task requires anticipating the …

arXiv cs.AI · Liqiang Nie · 2026-05-20 08:42

VISTA: Technical Report for the Ego4D Short-Term Object Interaction Anticipation at EgoVis 2026

We propose VISTA, a V-JEPA Integrated StillFast Temporal Anticipator for the Ego4D Short-Term Object Interaction Anticipation (STA) Challenge at EgoVis 2026. Given an egocentric video timestamp, the task requires anticipating the next human-object interaction, including the futur…