Researchers have introduced SceneScribe-1M, a new large-scale video dataset designed to bridge the gap between 3D geometric perception and video synthesis. The dataset contains one million in-the-wild videos, each annotated with textual descriptions, camera parameters, depth maps, and 3D point tracks. SceneScribe-1M aims to serve as a comprehensive benchmark for tasks like depth estimation and scene reconstruction, as well as generative tasks such as text-to-video synthesis.
Summary written by gemini-2.5-flash-lite from 1 source.
IMPACT Provides a new benchmark dataset for advancing both 3D perception and video generation models.
RANK_REASON This is a research paper describing a new dataset.