CineDance: Towards Next-Generation Multi-Shot Long-Form Cinematic Audio-Video Generation

CineDance-1M Dataset Advances Open-Source Cinematic Video Generation

By the PulseAugur editorial team · [2 sources] · 2026-06-08 15:35

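Researchers have introduced CineDance-1M, a very large-scale dataset for open-source text-to-audio-video generation, intended to improve cinematic storytelling. The dataset consists of long-form videos that average 92.8 seconds and 24.2 shots each, supported by structured audio-visual annotations produced through a three-stage curation process. To evaluate performance, they also propose CineBench, a new metric system for complex audio-visual narratives, and present an adapted LTX-2.3 model that shows strong alignment and consistency.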

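Impact: Provides a foundational dataset and evaluation tools to accelerate open-source research on long-form cinematic audio-video generation.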

Ranking rationale: This cluster contains an academic paper detailing a new dataset and benchmark for AI research.


AI-generated summary · Google Gemini · based on 2 sources.

Sources [2]

arXiv cs.CV TIER_1 English(EN) · Yuheng Chen, Teng Hu, Yuji Wang, Qingdong He, Zhucun Xue, Qianyu Zhou, Xiangtai Li, Lizhuang Ma, Jiangning Zhang, Dacheng Tao · 2026-06-09 04:00

CineDance: Towards Next-Generation Multi-Shot Long-Form Cinematic Audio-Video Generation

arXiv:2606.09639v1 Announce Type: new Abstract: The fidelity and structural diversity of training datasets fundamentally determine the capabilities of video generation models. While commercial systems show remarkable ability to generate cinematic narratives, the progress of open-source…
arXiv cs.CV TIER_1 English(EN) · Dacheng Tao · 2026-06-08 15:35

CineDance: Towards Next-Generation Multi-Shot Long-Form Cinematic Audio-Video Generation

The fidelity and structural diversity of training datasets fundamentally determine the capabilities of video generation models. While commercial systems show remarkable ability to generate cinematic narratives, the progress of open-source models remains limited by the scarcity of high-…
