MoCapAnything V2 enables end-to-end motion capture for arbitrary skeletons

By PulseAugur Editorial · [3 sources] · 2026-04-30 17:16

Researchers have developed MoCapAnything V2, an end-to-end system for 3D motion capture from monocular video that works with arbitrary skeletons. The framework folds pose and rotation prediction into a single, jointly optimized learnable process, replacing earlier factorized pipelines in which a Video-to-Pose network predicted joint positions and an analytical inverse-kinematics (IK) stage recovered joint rotations. Reference pose-rotation pairs resolve ambiguities in rotation prediction, reducing rotation error to approximately 6.54 degrees on unseen skeletons. The method also predicts joint positions directly from video without intermediate meshes, resulting in up to 20x faster inference.
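
For readers unfamiliar with how a rotation error in degrees is typically obtained, the sketch below computes the geodesic angle between a predicted and a ground-truth rotation matrix and averages it over joints. This is a common convention, not necessarily the metric used in the paper, which the summary does not specify. The function names (`rot_z`, `geodesic_angle_deg`, `mean_rotation_error_deg`) are hypothetical and chosen only for illustration.

```python
import math


def rot_z(deg):
    """Illustrative helper: 3x3 rotation matrix about the z-axis."""
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def geodesic_angle_deg(r_pred, r_true):
    """Angle in degrees of the relative rotation R_pred^T R_true.

    Uses trace(R_pred^T R_true) = sum_ij R_pred[i][j] * R_true[i][j]
    and angle = arccos((trace - 1) / 2).
    """
    trace = sum(r_pred[i][j] * r_true[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against floating-point drift.
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))


def mean_rotation_error_deg(pred, true):
    """Average geodesic error over paired per-joint rotations (assumed metric)."""
    errors = [geodesic_angle_deg(p, t) for p, t in zip(pred, true)]
    return sum(errors) / len(errors)


if __name__ == "__main__":
    predicted = [rot_z(30.0), rot_z(0.0)]
    reference = [rot_z(20.0), rot_z(0.0)]
    print(mean_rotation_error_deg(predicted, reference))
```

Under this convention, a figure such as 6.54 degrees would mean that, on average, each predicted joint rotation differs from its reference by a rotation of about that angle.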

IMPACT Enhances 3D animation capabilities by enabling more accurate and efficient motion capture for diverse digital assets.

RANK_REASON The cluster describes a new academic paper detailing an improved method for 3D motion capture.

AI-generated summary · Google Gemini · from 3 sources.

COVERAGE [3]

arXiv cs.CV TIER_1 English(EN) · Kehong Gong, Zhengyu Wen, Dao Thien Phong, Mingxi Xu, Weixia He, Qi Wang, Ning Zhang, Zhengyu Li, Guanli Hou, Dongze Lian, Xiaoyu He, Mingyuan Zhang, Hanwang Zhang · 2026-05-01 04:00

MoCapAnything V2: End-to-End Motion Capture for Arbitrary Skeletons

arXiv:2604.28130v1 · Announce Type: new · Abstract: Recent methods for arbitrary-skeleton motion capture from monocular video follow a factorized pipeline, where a Video-to-Pose network predicts joint positions and an analytical inverse-kinematics (IK) stage recovers joint rotations.…

arXiv cs.CV TIER_1 English(EN) · Kehong Gong, Zhengyu Wen, Weixia He, Mingxi Xu, Qi Wang, Ning Zhang, Zhengyu Li, Dongze Lian, Wei Zhao, Xiaoyu He, Mingyuan Zhang · 2026-05-01 04:00

MoCapAnything: Unified 3D Motion Capture for Arbitrary Skeletons from Monocular Videos

arXiv:2512.10881v2 · Announce Type: replace · Abstract: Motion capture now underpins content creation far beyond digital humans, yet most existing pipelines remain species- or template-specific. We formalize this gap as Category-Agnostic Motion Capture (CAMoCap): given a monocular vi…

arXiv cs.CV TIER_1 English(EN) · Hanwang Zhang · 2026-04-30 17:16

MoCapAnything V2: End-to-End Motion Capture for Arbitrary Skeletons

Recent methods for arbitrary-skeleton motion capture from monocular video follow a factorized pipeline, where a Video-to-Pose network predicts joint positions and an analytical inverse-kinematics (IK) stage recovers joint rotations. While effective, this design is inherently limi…
