WorldPlay Model Enables Real-Time Interactive Video Generation

By PulseAugur Editorial · [1 source] · 2026-06-10 04:00

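Researchers have developed WorldPlay, a novel streaming video diffusion model designed for real-time, interactive world modeling. The model uses a Dual Action Representation for robust control over user input, and a Reconstituted Context Memory with temporal reframing to maintain long-term geometric consistency, addressing the trade-off between speed and memory in current systems. In addition, Context Forcing, a distillation method, ensures that the model makes effective use of long-range information, enabling real-time generation of 720p video at 24 FPS with improved consistency and generalization.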
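To put the reported "720p at 24 FPS" figure in perspective, the short calculation below works out the average time budget per frame and the pixel throughput it implies. It is only back-of-the-envelope arithmetic, not part of WorldPlay: it assumes "720p" means the common 1280x720 resolution, and the function names are hypothetical.

```python
# Back-of-the-envelope arithmetic for a real-time video target.
# Assumption: "720p" is taken to mean 1280x720; the summary does not state the width.

FPS = 24
WIDTH, HEIGHT = 1280, 720


def frame_budget_ms(fps: int) -> float:
    """Average wall-clock time available per frame, in milliseconds."""
    return 1000.0 / fps


def pixels_per_second(width: int, height: int, fps: int) -> int:
    """Number of output pixels that must be produced each second."""
    return width * height * fps


if __name__ == "__main__":
    print(f"Average budget per frame: {frame_budget_ms(FPS):.2f} ms")
    print(f"Pixels per second: {pixels_per_second(WIDTH, HEIGHT, FPS):,}")
```

Under these assumptions the average budget is about 41.67 ms per frame, or roughly 22.1 million pixels per second. A streaming model that emits frames in chunks would need to meet this budget on average across a chunk rather than for every individual frame.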

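Impact: Introduces a new approach to real-time interactive video generation with improved consistency, with potential implications for content creation and simulation tools.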

Ranking rationale: This is a research paper describing a new model and methodology.


Wenqiang Sun

AI-generated summary · Google Gemini · based on 1 source.

Sources [1]

arXiv cs.CV · English (EN) · Wenqiang Sun, Haiyu Zhang, Haoyuan Wang, Junta Wu, Zehan Wang, Zhenwei Wang, Yunhong Wang, Jun Zhang, Tengfei Wang, Chunchao Guo · 2026-06-10 04:00

WorldPlay: Towards Long-Term Geometric Consistency for Real-Time Interactive World Modeling

arXiv:2512.14614v2 Announce Type: replace Abstract: This paper presents WorldPlay, a streaming video diffusion model that enables real-time, interactive world modeling with long-term geometric consistency, resolving the trade-off between speed and memory that limits current metho…
