PulseAugur

OSP-Next video model achieves 83.73% VBench score with efficiency gains

Researchers have introduced OSP-Next, a text-to-video generation model designed for efficiency and quality. The model combines sparse attention, a Sparse Sequence Parallelism (SSP) technique that reduces communication volume by 75% compared to existing methods, HiF8 quantization for stable 8-bit training, and reinforcement learning. In experiments, OSP-Next achieves a VBench score of 83.73%, outperforming the Wan2.1 baseline, and delivers speedups on several hardware platforms, including NVIDIA H200 GPUs and Ascend 950PR NPUs.

IMPACT OSP-Next demonstrates significant efficiency gains and improved quality in text-to-video generation, potentially accelerating research and development in the field.

RANK_REASON The cluster contains a research paper detailing a new model and techniques for video generation.

AI-generated summary · Google Gemini · from 3 sources.

COVERAGE [3]

  1. Hugging Face Daily Papers TIER_1 English(EN) ·

    OSP-Next: Efficient High-Quality Video Generation with Sparse Sequence Parallelism, HiF8 Quantization, and Reinforcement Learning

    OSP-Next is an efficient text-to-video generation model that combines sparse attention, parallelism, quantization, and reinforcement learning to achieve high-quality video synthesis with reduced computational costs.

  2. arXiv cs.CV TIER_1 English(EN) · Yunyang Ge, Xianyi He, Zezhong Zhang, Bin Lin, Bin Zhu, Xinhua Cheng, Li Yuan ·

    OSP-Next: Efficient High-Quality Video Generation with Sparse Sequence Parallelism, HiF8 Quantization, and Reinforcement Learning

    arXiv:2605.28691v1 Announce Type: new Abstract: Diffusion Transformers achieve strong video generation quality, but the quadratic cost of full attention limits efficiency. We introduce OSP-Next, an efficient text-to-video generation model that integrates sparse attention, paralle…

  3. arXiv cs.CV TIER_1 English(EN) · Li Yuan ·

    OSP-Next: Efficient High-Quality Video Generation with Sparse Sequence Parallelism, HiF8 Quantization, and Reinforcement Learning

    Diffusion Transformers achieve strong video generation quality, but the quadratic cost of full attention limits efficiency. We introduce OSP-Next, an efficient text-to-video generation model that integrates sparse attention, parallelism, quantization, and reinforcement learning. …