ACE-Ego-0: Unifying Egocentric Human and Robotic Data for VLA Pretraining
Researchers have introduced ACE-Ego-0, a pretraining framework that unifies human egocentric video and robot trajectory data for Vision-Language-Action (VLA) models. It bridges the two data sources by converting human videos into pseudo-action trajectories in the robot data format. A reliability-aware training objective then lets the model make effective use of these noisy, human-derived pseudo-actions, leading to improved performance on embodied AI tasks.
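The summary does not say what form the reliability-aware objective takes. As a rough illustration of the general idea only, the sketch below assumes each trajectory sample carries a reliability score in [0, 1] (for example, 1.0 for real robot actions and a lower value for pseudo-actions derived from human video) and uses that score to weight a mean-squared action error. The function name `reliability_weighted_loss`, the scoring scheme, and the choice of squared error are all assumptions made for this example, not details taken from ACE-Ego-0.

```python
from typing import Sequence


def reliability_weighted_loss(
    predicted: Sequence[Sequence[float]],
    target: Sequence[Sequence[float]],
    reliability: Sequence[float],
) -> float:
    """Reliability-weighted mean squared action error (hypothetical sketch).

    Each sample's squared error is scaled by its reliability weight, so
    low-confidence pseudo-actions contribute less to the total loss.
    """
    if not (len(predicted) == len(target) == len(reliability)):
        raise ValueError("predicted, target, and reliability must have equal length")

    total_weight = sum(reliability)
    if total_weight == 0:
        # No sample is trusted at all, so there is nothing to learn from.
        return 0.0

    weighted_error = 0.0
    for pred_action, true_action, weight in zip(predicted, target, reliability):
        # Mean squared error over the action dimensions of one sample.
        sample_error = sum(
            (p - t) ** 2 for p, t in zip(pred_action, true_action)
        ) / len(pred_action)
        weighted_error += weight * sample_error

    # Normalize by the total weight so the loss scale does not depend on
    # how many low-reliability samples are in the batch.
    return weighted_error / total_weight


# Assumed example: one robot sample (weight 1.0) and one human-derived
# pseudo-action sample (weight 0.5) with a larger prediction error.
loss = reliability_weighted_loss(
    predicted=[[0.0, 0.0], [1.0, 1.0]],
    target=[[0.0, 0.0], [0.0, 0.0]],
    reliability=[1.0, 0.5],
)
```

Under this weighting, setting a sample's reliability to 0.0 removes it from the loss entirely, while a value of 1.0 treats it like ordinary robot data. A real system would need some way to estimate these scores, which the summary does not describe.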