Physical Object Understanding with a Physically Controllable World Model

New world model learns physical laws and object manipulation from video

By the PulseAugur editorial team · [1 source] · 2026-06-02 04:00

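Researchers have developed a novel probabilistic world model that learns the physical structure of scenes from video data. The model can infer distributional states, predict future physical interactions, and even manipulate objects in 3D. By analyzing motion correlations, the system identifies objects and their sub-parts, enabling applications such as visual Jenga.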

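Impact: Introduces a new approach to visual intelligence that could improve AI's ability to understand and interact with the physical world.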

Ranking rationale: This cluster contains a research paper detailing a novel probabilistic world model.

AI-generated summary · Google Gemini · from 1 source.

Sources [1]

arXiv cs.CV · Rahul Venkatesh, Klemen Kotar, Lilian Naing Chen, Wanhee Lee, Gia Ancone, Seungwoo Kim, Luca Thomas Wheeler, Jared Watrous, Honglin Chen, Daniel Bear, Stefan Stojanov, Daniel LK Yamins · 2026-06-02 04:00

Physical Object Understanding with a Physically Controllable World Model

arXiv:2606.00439v1. Abstract: A central challenge in visual intelligence is learning the physical structure of scenes from raw videos: how regions form objects and the laws that govern their interactions. Solving these tasks requires world models capable of infe…