Latent Action Pretraining Through World Modeling

New LAWM framework enables self-supervised robot learning from video

By the PulseAugur editorial team · [1 source] · 2026-06-16 04:00

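Researchers have introduced LAWM, a new self-supervised pretraining framework for robot imitation learning models. The approach is model-agnostic: it learns latent action representations from unlabeled video by modeling the abstract visual changes between frames, which lets knowledge transfer across tasks, environments, and embodiments. Compared with models pretrained on ground-truth actions or with other self-supervised methods, LAWM performs better on the LIBERO benchmark and in real-robot settings, while also being more computationally efficient.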
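To make the idea of "latent actions from frame-to-frame change" concrete, here is a minimal sketch under loudly stated assumptions. It is not the LAWM architecture: the linear encoder and decoder, the synthetic frame features, the sizes `B`, `d`, `k`, and every function name below are hypothetical stand-ins chosen only for illustration. An encoder looks at two consecutive frames and compresses their difference into a small latent vector, and a simple world model must predict the next frame from the current frame plus that vector. No action labels are used.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: d-dim frame features, k-dim latent action (the bottleneck).
B, d, k = 64, 16, 4

# Synthetic "video": the change between consecutive frames is driven by a
# hidden action that the model never observes.
frames_t = rng.normal(size=(B, d))
hidden_actions = rng.normal(size=(B, k))
effect = rng.normal(size=(k, d)) * 0.5
frames_next = frames_t + hidden_actions @ effect

# Linear stand-ins for the two learned parts (an assumption, not the paper's model).
W_enc = rng.normal(size=(2 * d, k)) * 0.1  # (frame_t, frame_next) -> latent action
W_dec = rng.normal(size=(k, d)) * 0.1      # latent action -> predicted frame change


def infer_latent_action(o_t, o_next, W_enc):
    """Compress the transition between two frames into a k-dim latent action."""
    return np.concatenate([o_t, o_next], axis=1) @ W_enc


def loss_and_grads(o_t, o_next, W_enc, W_dec):
    """Mean squared next-frame prediction error and its analytic gradients."""
    x = np.concatenate([o_t, o_next], axis=1)
    z = x @ W_enc                      # latent actions, shape (B, k)
    err = (o_t + z @ W_dec) - o_next   # world-model prediction error, shape (B, d)
    loss = np.mean(err ** 2)
    g = 2.0 * err / err.size           # d(loss) / d(prediction)
    grad_dec = z.T @ g
    grad_enc = x.T @ (g @ W_dec.T)
    return loss, grad_enc, grad_dec


initial_loss, _, _ = loss_and_grads(frames_t, frames_next, W_enc, W_dec)

# Plain gradient descent; because k < d, the encoder cannot simply copy the
# next frame and has to keep only a compact description of the change.
lr = 0.1
for _ in range(500):
    _, g_enc, g_dec = loss_and_grads(frames_t, frames_next, W_enc, W_dec)
    W_enc -= lr * g_enc
    W_dec -= lr * g_dec

final_loss, _, _ = loss_and_grads(frames_t, frames_next, W_enc, W_dec)
latent_actions = infer_latent_action(frames_t, frames_next, W_enc)
print(f"loss before: {initial_loss:.4f}, after: {final_loss:.4f}")
```

In a setup like this, the learned latent vectors could later serve as pseudo action labels for pretraining a policy, which is the general motivation behind latent-action methods; how LAWM does this in detail is described in the paper, not here.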

Impact: By reducing reliance on manually labeled data, this research could make robot learning more efficient and more accessible.

Ranking rationale: This cluster contains an academic paper detailing a new research framework for robotics.


AI-generated summary · Google Gemini · based on 1 source.

Sources [1]

arXiv cs.CV · Bahey Tharwat, Yara Nasser, Ali Abouzeid, Ian Reid · 2026-06-16 04:00

Latent Action Pretraining Through World Modeling

arXiv:2509.18428v2 Announce Type: replace-cross Abstract: Vision-Language-Action (VLA) models have gained popularity for learning robotic manipulation tasks that follow language instructions. State-of-the-art VLAs, such as OpenVLA and $\pi_{0}$, were trained on large-scale, manua…

