Researchers have developed VideoMDM, a diffusion-based framework for generating 3D human motion from 2D video supervision. The method learns 3D motion priors directly from 2D poses, without requiring explicit 3D ground-truth data. By using a pretrained 2D-to-3D lifter as a noisy teacher and a depth-weighted 2D reprojection loss, VideoMDM achieves performance close to fully 3D-supervised models on benchmarks like HumanML3D. The framework also demonstrates success on real-world video datasets such as Fit3D and NBA, generating motions that are preferred by human evaluators.
AI IMPACT Enables more accessible 3D motion generation for applications like animation and virtual reality by leveraging readily available 2D video data.
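To make the idea of a depth-weighted 2D reprojection loss concrete, here is a minimal sketch. It is not VideoMDM's actual formulation: the pinhole camera model, the `focal` parameter, the function names, and the choice of weighting each joint's error by its normalised predicted depth are all assumptions made for illustration.

```python
def project(joint, focal=1.0):
    """Pinhole projection of a camera-space joint (x, y, z) to 2D. Assumes z > 0."""
    x, y, z = joint
    return (focal * x / z, focal * y / z)


def depth_weighted_reprojection_loss(pred_3d, target_2d, focal=1.0):
    """Weighted mean of squared 2D errors between projected 3D joints and 2D targets.

    ASSUMPTION: each joint's weight is its predicted depth divided by the sum of
    all depths. The summarized paper's weighting scheme may differ.
    """
    depths = [joint[2] for joint in pred_3d]
    total_depth = sum(depths)
    loss = 0.0
    for joint, (u_target, v_target), depth in zip(pred_3d, target_2d, depths):
        u, v = project(joint, focal)
        squared_error = (u - u_target) ** 2 + (v - v_target) ** 2
        loss += (depth / total_depth) * squared_error
    return loss


# Two predicted joints in camera space, compared against detected 2D keypoints.
pred = [(0.0, 0.0, 1.0), (3.0, 0.0, 3.0)]
detected = [(0.0, 0.0), (0.0, 0.0)]
loss = depth_weighted_reprojection_loss(pred, detected)
```

Because the loss compares only projected coordinates, it can be computed from 2D poses alone, which is what lets 2D video serve as supervision for a 3D model.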
Read on Hugging Face Daily Papers →
AI-generated summary · Google Gemini · from 3 sources.