New ATOM-Bench benchmark tests generalization in robotic manipulation

By PulseAugur Editorial · [2 sources] · 2026-06-15 15:08

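Researchers have introduced ATOM-Bench, a new real-world benchmark designed to evaluate atomic skills and compositional generalization in robotic manipulation policies. The benchmark comprises 30 atomic tasks and 24 held-out compositional tasks, and uses 3,000 human demonstrations for fine-tuning and evaluation. Initial tests of five representative policies show that current models can handle basic instruction understanding but struggle with fine-grained motor skills and with reliably composing learned skills to complete new tasks.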

Impact: The benchmark aims to improve the real-world generalization of robotic manipulation policies, addressing a key challenge in robot AI.

Ranking rationale: This cluster covers a new academic benchmark and the associated paper published on arXiv.

AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

arXiv cs.AI TIER_1 English(EN) · Zenan Wu, Bingqing Wei, Lu Liu, Zheqi He, Xi Wang, Jiakang Liu, Zehui Li, Guocai Yao, Jing-Shu Zheng, Xi Yang, Yongtao Wang · 2026-06-16 04:00

ATOM-Bench: A Real-World Benchmark for Atomic Skills and Compositional Generalization in Manipulation Policies

arXiv:2606.16826v1 Announce Type: cross Abstract: Generalist manipulation policies are increasingly presented as foundation models for robotic control, but their real-world generalization remains difficult to diagnose. A policy may succeed on demonstrated tasks while still failin…
arXiv cs.AI TIER_1 English(EN) · Yongtao Wang · 2026-06-15 15:08

ATOM-Bench: A Real-World Benchmark for Atomic Skills and Compositional Generalization in Manipulation Policies

Generalist manipulation policies are increasingly presented as foundation models for robotic control, but their real-world generalization remains difficult to diagnose. A policy may succeed on demonstrated tasks while still failing to execute fine-grained atomic skills or recombi…
