PulseAugur

CineDance-1M dataset advances open-source cinematic video generation

Researchers have introduced CineDance-1M, a large-scale dataset for open-source text-to-audio-video generation, intended to improve the cinematic narrative capabilities of open models. The dataset consists of long-form videos averaging 92.8 seconds and 24.2 shots each, supported by structured audio-video annotations produced by a three-stage curation process. To evaluate performance, the authors also propose CineBench, a new metric system for complex audio-video narratives, and demonstrate an adapted LTX-2.3 model that shows strong alignment and consistency.

IMPACT Provides a foundational dataset and evaluation tools to accelerate open-source research in long-form cinematic audio-video generation.

RANK_REASON The cluster contains an academic paper detailing a new dataset and benchmark for AI research.


AI-generated summary · Google Gemini · from 2 sources.

COVERAGE [2]

  1. arXiv cs.CV TIER_1 English(EN) · Yuheng Chen, Teng Hu, Yuji Wang, Qingdong He, Zhucun Xue, Qianyu Zhou, Xiangtai Li, Lizhuang Ma, Jiangning Zhang, Dacheng Tao

    CineDance: Towards Next-Generation Multi-Shot Long-Form Cinematic Audio-Video Generation

    arXiv:2606.09639v1 Announce Type: new Abstract: The fidelity and structural diversity of training datasets fundamentally determine the capabilities of video generation models. While commercial systems show remarkable ability to generate cinematic narratives, the progress of open-source…

  2. arXiv cs.CV TIER_1 English(EN) · Dacheng Tao

    CineDance: Towards Next-Generation Multi-Shot Long-Form Cinematic Audio-Video Generation

    The fidelity and structural diversity of training datasets fundamentally determine the capabilities of video generation models. While commercial systems show remarkable ability to generate cinematic narratives, the progress of open-source models remains limited by the scarcity of high-…