WorldMark: A Unified Benchmark Suite for Interactive Video World Models

By PulseAugur Editorial · [4 sources] · 2024-02-15 08:00

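OpenAI has released Sora, a video generation model that can produce high-fidelity videos up to one minute long. Sora uses a diffusion transformer architecture that processes video and image data as spacetime patches. This approach lets it handle variable durations, resolutions, and aspect ratios, with the stated aim of building a general-purpose simulator of the physical world. Separately, a new benchmark suite called WorldMark has been introduced to standardize the evaluation of interactive video world models, addressing the earlier lack of comparable metrics across models.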

Ranking rationale: OpenAI released Sora, a frontier video generation model, accompanied by a technical report detailing its capabilities.

Read on Hugging Face Daily Papers →

AI-generated summary · Google Gemini · from 4 sources. How we write summaries →

Sources [4]

OpenAI News TIER_1 English(EN) · 2024-02-15 08:00

Video generation models as world simulators

We explore large-scale training of generative models on video data. Specifically, we train text-conditional diffusion models jointly on videos and images of variable durations, resolutions and aspect ratios. We leverage a transformer architecture that operates on spacetime patche…

Hugging Face Daily Papers TIER_1 English(EN) · 2026-04-23 13:50

WorldMark: A Unified Benchmark Suite for Interactive Video World Models

Interactive video generation models such as Genie, YUME, HY-World, and Matrix-Game are advancing rapidly, yet every model is evaluated on its own benchmark with private scenes and trajectories, making fair cross-model comparison impossible. Existing public benchmarks offer useful…

Synced Review TIER_1 English(EN) · Synced · 2025-05-28 09:31

Adobe Research Unlocks Long-Term Memory in Video World Models with State-Space Models

By combining State-Space Models (SSMs) for efficient long-range dependency modeling with dense local attention for coherence, and using training strategies like diffusion forcing and frame local attention, researchers from Adobe Research successfully overcome the long-standing…

arXiv cs.CV TIER_1 English(EN) · Yongtao Ge · 2026-04-23 13:50

WorldMark: A Unified Benchmark Suite for Interactive Video World Models

Interactive video generation models such as Genie, YUME, HY-World, and Matrix-Game are advancing rapidly, yet every model is evaluated on its own benchmark with private scenes and trajectories, making fair cross-model comparison impossible. Existing public benchmarks offer useful…
