FlowLong: Inference-time Long Video Generation via Manifold-constrained Tweedie Matching
Researchers have developed FlowLong, an inference-time method that extends video diffusion models to generate longer sequences. It uses overlapping sliding windows and a technique called Tweedie matching to keep the output temporally consistent and preserve visual quality, with no additional training. FlowLong is architecture-agnostic: beyond extending video length, it has also been applied to joint audio-video generation and text-to-3D scene generation.
AI IMPACT: Enables longer, more consistent video generation from diffusion models without additional training.
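
To make the overlapping-sliding-window idea concrete, here is a minimal Python sketch. It is not FlowLong's algorithm: the summary does not spell out the Tweedie matching step, so this example only shows how a long frame sequence might be split into overlapping windows and how per-window estimates could be reconciled by plain averaging, a generic stand-in. The function names `sliding_windows` and `blend_overlaps`, and the use of one scalar per frame in place of a real denoised frame, are assumptions made for illustration.

```python
def sliding_windows(num_frames, window, overlap):
    """Return (start, end) index pairs that cover num_frames with overlap."""
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")
    if num_frames <= window:
        return [(0, num_frames)]
    stride = window - overlap
    starts = list(range(0, num_frames - window + 1, stride))
    if starts[-1] + window < num_frames:
        # Add a final window flush with the end so every frame is covered.
        starts.append(num_frames - window)
    return [(s, s + window) for s in starts]


def blend_overlaps(num_frames, windows, estimates):
    """Average per-window estimates frame by frame (hypothetical stand-in).

    estimates[i] holds one value per frame of windows[i]; a real model
    would produce a denoised frame tensor instead of a scalar.
    """
    total = [0.0] * num_frames
    count = [0] * num_frames
    for (start, _end), est in zip(windows, estimates):
        for offset, value in enumerate(est):
            total[start + offset] += value
            count[start + offset] += 1
    return [t / c for t, c in zip(total, count)]


if __name__ == "__main__":
    wins = sliding_windows(num_frames=10, window=4, overlap=2)
    print(wins)
    # Pretend each window's model call returned a constant estimate.
    fake = [[float(i)] * (end - start) for i, (start, end) in enumerate(wins)]
    print(blend_overlaps(10, wins, fake))
```

Frames that fall in two windows receive the mean of both estimates, which is the simplest way to keep neighbouring windows from disagreeing. A training-free method like the one described would apply some such reconciliation during sampling rather than once at the end.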