New methods enhance autoregressive video generation quality and efficiency

By PulseAugur Editorial · [14 sources] · 2026-05-15 14:33

Researchers are developing new methods to improve autoregressive video generation, focusing on efficiency and quality. One approach, One-Forcing, combines a distribution matching distillation (DMD) objective with a GAN loss to achieve stable, high-quality one-step video generation, outperforming existing one-step methods on benchmarks. Another technique, DySink, uses a retrieval-based framework with dynamic frame sinks to maintain adaptive long-range context and prevent generation collapse in longer videos. A third, Adversarial Flow Distillation (AFD), is an on-policy method for distilling heterogeneous black-box video generators into efficient autoregressive students without requiring teacher scores.

IMPACT New methods promise more stable, efficient, and higher-quality video generation, potentially enabling new applications in real-time interactive content and world simulation.

RANK_REASON Multiple research papers introduce novel techniques for improving autoregressive video generation.

AI-generated summary · Google Gemini · from 14 sources.

COVERAGE [14]

arXiv cs.AI TIER_1 English(EN) · Jiaqi Feng, Justin Cui, Yuanhao Ban, Cho-Jui Hsieh · 2026-05-25 04:00

One-Forcing: Towards Stable One-Step Autoregressive Video Generation

arXiv:2605.23458v1. Recent advances have substantially improved real-time interactive video generation in the autoregressive regime. However, most existing few-step autoregressive video generation methods, often distilled from a corresponding many-st…

Hugging Face Daily Papers TIER_1 English(EN) · 2026-05-25 00:00

On-Policy Adversarial Flow Distillation for Autoregressive Video Generation

Adversarial Flow Distillation enables efficient distillation of heterogeneous video generation models by using on-policy feedback and forward-process flow-matching updates without requiring teacher scores or detailed trajectory information.

arXiv cs.AI TIER_1 English(EN) · Bo Ye, Xinyu Cui, Jian Zhao, Tong Wei, Min-Ling Zhang · 2026-05-22 04:00

DySink: Dynamic Frame Sinks for Autoregressive Long Video Generation

arXiv:2605.21028v1. Autoregressive long video generation often adopts bounded-memory streaming for efficiency, typically combining local windows for short-term continuity with static early-frame sinks as long-range anchors. However, this fixed alloca…

arXiv cs.AI TIER_1 English(EN) · Min-Ling Zhang · 2026-05-20 11:01

DySink: Dynamic Frame Sinks for Autoregressive Long Video Generation

Autoregressive long video generation often adopts bounded-memory streaming for efficiency, typically combining local windows for short-term continuity with static early-frame sinks as long-range anchors. However, this fixed allocation keeps early frames cached even when the curre…
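
The static allocation this abstract describes as the common baseline can be illustrated with a minimal Python sketch. It is not DySink's dynamic method and is not taken from the paper: the function name `cached_frames` and the parameters `num_sinks` and `window` are hypothetical, chosen only to show which past frames stay in a bounded cache when a fixed set of early frames is combined with a sliding local window.

```python
def cached_frames(current: int, num_sinks: int, window: int) -> list[int]:
    """Return indices of past frames kept in memory before generating frame `current`.

    Illustrative assumption, not the paper's code: the cache always holds the
    first `num_sinks` frames (static sinks) plus the `window` most recent frames.
    """
    if current < 0 or num_sinks < 0 or window < 0:
        raise ValueError("all arguments must be non-negative")
    # Static sinks: the earliest frames, kept regardless of the current content.
    sinks = range(min(num_sinks, current))
    # Local window: the most recent frames, for short-term continuity.
    recent = range(max(current - window, 0), current)
    # Merge and deduplicate (the two sets overlap early in the video).
    return sorted(set(sinks) | set(recent))


if __name__ == "__main__":
    for t in (0, 3, 10, 100):
        print(t, cached_frames(t, num_sinks=2, window=3))
```

Under these assumptions the cache never holds more than `num_sinks + window` frames, and frames 0 and 1 stay cached at step 100 whether or not they are still relevant to the scene, which is the fixed-allocation behavior the abstract points to.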

Hugging Face Daily Papers TIER_1 English(EN) · 2026-05-19 01:28

PhyWorld: Physics-Faithful World Model for Video Generation

World simulators can provide safe and scalable environments for training Physical AI systems before real-world deployment. Large video generation models are emerging as a promising basis for such simulators because they can generate diverse and realistic visual futures. However, …

arXiv cs.CV TIER_1 English(EN) · Yang Luo, Shengju Qian, Xiaohang Tang, Zirui Zhu, Yong Liu, Xin Wang, Yang You · 2026-05-26 04:00

On-Policy Adversarial Flow Distillation for Autoregressive Video Generation

arXiv:2605.26105v1. Autoregressive video generators are attractive for streaming, long-horizon, and interactive applications, but distilling strong black-box teachers into causal students remains difficult. The student must learn under its own rollout …

arXiv cs.CV TIER_1 English(EN) · Yang You · 2026-05-25 17:58

On-Policy Adversarial Flow Distillation for Autoregressive Video Generation

Autoregressive video generators are attractive for streaming, long-horizon, and interactive applications, but distilling strong black-box teachers into causal students remains difficult. The student must learn under its own rollout distribution, whereas practical teachers may exp…

arXiv cs.CV TIER_1 English(EN) · Cho-Jui Hsieh · 2026-05-22 10:16

One-Forcing: Towards Stable One-Step Autoregressive Video Generation

Recent advances have substantially improved real-time interactive video generation in the autoregressive regime. However, most existing few-step autoregressive video generation methods, often distilled from a corresponding many-step teacher, default to a 4-step sampling configura…

arXiv cs.CV TIER_1 English(EN) · Hongzhou Zhu, Min Zhao, Guande He, Hang Su, Chongxuan Li, Jun Zhu · 2026-05-22 04:00

Causal Forcing: Autoregressive Diffusion Distillation Done Right for High-Quality Real-Time Interactive Video Generation

arXiv:2602.02214v3. To achieve real-time interactive video generation, current methods distill pretrained bidirectional video diffusion models into few-step autoregressive (AR) models, facing an architectural gap when full attention is replaced by …

arXiv cs.CV TIER_1 English(EN) · Sheng Li, Yang Sui, Junhao Ran, Bo Yuan, Yue Dai, Xulong Tang · 2026-05-22 04:00

Temporal Aware Pruning for Efficient Diffusion-based Video Generation

arXiv:2605.17837v2. Video diffusion models have recently enabled high-quality video generation with ViT-based architectures, but remain computationally intensive because generation requires attention computation over long spatiotemporal sequences. …

arXiv cs.CV TIER_1 English(EN) · Linfeng Zhang · 2026-05-20 11:24

Dynamic Video Generation: Shaping Video Generation Across Time and Space

Diffusion models have achieved impressive performance in video generation, but their iterative denoising process remains computationally expensive due to the large number of tokens processed at each timestep. Recently, progressive resolution sampling has emerged as a promising ac…

arXiv cs.CV TIER_1 English(EN) · Jong Chul Ye · 2026-05-20 08:55

FlowLong: Inference-time Long Video Generation via Manifold-constrained Tweedie Matching

Extending the generation horizon of video diffusion models to long sequences remains a long-standing and important challenge. Existing training-free approaches fall into two categories: extensions of bidirectional models, which are tightly coupled to specific architectures and su…

arXiv cs.CV TIER_1 English(EN) · K. Huang · 2026-05-18 11:28

Enhancing Train-Free Infinite-Frame Generation for Consistent Long Videos

Without incurring significant computational overhead, train-free long video generation aims to enable foundation video generation models to produce longer videos. Frame-level autoregressive frameworks, e.g., FIFO-diffusion, offer the advantage of generating infinitely long videos…

arXiv cs.CV TIER_1 English(EN) · Chuanguang Yang · 2026-05-15 14:33

Echo-Forcing: A Scene Memory Framework for Interactive Long Video Generation

Autoregressive video diffusion models enable open-ended generation through local attention and KV caching. However, existing training-free long-video optimization methods mainly focus on stable extension under a single prompt, making them difficult to handle interactive scenarios…
