PulseAugur

Next Forcing Accelerates Video Generation with Multi-Chunk Prediction

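Researchers have introduced Next Forcing, a novel multi-chunk prediction framework designed to strengthen causal world modeling in autoregressive video generation. Inspired by large language models, the method predicts multiple future video chunks simultaneously, which provides denser temporal supervision and speeds up training convergence. The framework achieves state-of-the-art results on benchmarks such as RoboTwinPhyWorld while also doubling inference speed.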
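To illustrate why predicting several future chunks at once yields denser temporal supervision, here is a minimal sketch that only counts supervised targets per sequence. It is not the paper's implementation: the function names and the `horizon` parameter are hypothetical, and the sketch assumes every context position is asked to predict up to `horizon` following chunks.

```python
def multi_chunk_targets(num_chunks, horizon):
    """Pair each context position t with the future chunk indices it would
    be trained to predict: t+1 .. t+horizon, clipped at the sequence end.

    horizon=1 corresponds to ordinary next-chunk prediction (assumption:
    illustrative indexing only, no model or loss is involved).
    """
    last = num_chunks - 1
    pairs = []
    for t in range(last):
        targets = list(range(t + 1, min(t + horizon, last) + 1))
        pairs.append((t, targets))
    return pairs


def count_targets(pairs):
    """Total number of (context, target) supervision signals."""
    return sum(len(targets) for _, targets in pairs)


next_chunk = multi_chunk_targets(num_chunks=6, horizon=1)
multi_chunk = multi_chunk_targets(num_chunks=6, horizon=3)

print(count_targets(next_chunk))   # 5 targets with next-chunk prediction
print(count_targets(multi_chunk))  # 12 targets with a 3-chunk horizon
```

With six chunks, a three-chunk horizon supplies 12 training targets instead of 5 from the same sequence, which is the sense in which supervision becomes denser under these assumptions.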

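Impact: Speeds up both training and inference for autoregressive video generation models, which may enable more complex real-time applications.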

Ranking rationale: This is an academic paper detailing a new method for video generation.


AI-generated summary · Google Gemini · based on 3 sources.

Sources [3]

  1. Hugging Face Daily Papers · TIER_1 · English (EN)

    Next Forcing: Causal World Modeling with Multi-Chunk Prediction

    Autoregressive video generation has emerged as a powerful paradigm for World Action Models (WAMs). However, existing approaches suffer from slow training convergence and limited converged accuracy, particularly at high frame rates, as the training supervision is confined to the c…

  2. arXiv cs.CV · TIER_1 · English (EN) · Gangwei Xu, Qihang Zhang, Jiaming Zhou, Xing Zhu, Yujun Shen, Xin Yang, Yinghao Xu

    Next Forcing: Causal World Modeling with Multi-Chunk Prediction

    arXiv:2606.11187v1. Autoregressive video generation has emerged as a powerful paradigm for World Action Models (WAMs). However, existing approaches suffer from slow training convergence and limited converged accuracy, particularly at high frame rates, …

  3. arXiv cs.CV · TIER_1 · English (EN) · Yinghao Xu

    Next Forcing: Causal World Modeling with Multi-Chunk Prediction

    Autoregressive video generation has emerged as a powerful paradigm for World Action Models (WAMs). However, existing approaches suffer from slow training convergence and limited converged accuracy, particularly at high frame rates, as the training supervision is confined to the c…