Brief · PulseAugur

RESEARCH · arXiv cs.LG · 20h · [2 sources]

Task Robustness via Re-Labelling Vision-Action Robot Data

Researchers have introduced TREAD, a framework that improves robot learning by augmenting existing datasets. It uses large Vision-Language Models (VLMs) to generate more diverse and linguistically richer instructions for robot tasks. TREAD decomposes each demonstration into grounded language-action pairs and adds variations of the text goals, which improves a robot's ability to understand new instructions and generalize to new scenarios.
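The re-labelling idea described above can be illustrated with a minimal Python sketch. It is not TREAD's implementation: the `Sample` type, the `relabel` function, the fixed segment boundaries, and the two stub functions standing in for VLM calls are all hypothetical names chosen for illustration. It only shows the general shape of the augmentation, where one demonstration yields several training samples: the original, copies with reworded goals, and shorter clips paired with their own descriptions.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Sample:
    instruction: str            # natural-language goal for this clip
    frames: Tuple[str, ...]     # stand-in for camera observations
    actions: Tuple[str, ...]    # stand-in for robot actions


def relabel(
    demo: Sample,
    segments: Sequence[Tuple[int, int]],
    describe: Callable[[Tuple[str, ...]], str],
    paraphrase: Callable[[str], List[str]],
) -> List[Sample]:
    """Expand one demonstration into several language-action samples."""
    out = [demo]  # keep the original sample unchanged
    # Goal variations: same frames and actions, reworded instruction.
    for text in paraphrase(demo.instruction):
        out.append(Sample(text, demo.frames, demo.actions))
    # Decomposition: each segment becomes its own grounded pair.
    for start, end in segments:
        frames = demo.frames[start:end]
        out.append(Sample(describe(frames), frames, demo.actions[start:end]))
    return out


# Hypothetical stubs; a real pipeline would query a VLM here.
def fake_describe(frames: Tuple[str, ...]) -> str:
    return f"do the step shown in {frames[0]} to {frames[-1]}"


def fake_paraphrase(text: str) -> List[str]:
    return [f"please {text}", f"could you {text}?"]


demo = Sample(
    "put the mug on the shelf",
    ("f0", "f1", "f2", "f3"),
    ("reach", "grasp", "lift", "place"),
)
augmented = relabel(demo, [(0, 2), (2, 4)], fake_describe, fake_paraphrase)
for s in augmented:
    print(s.instruction, s.actions)
```

In this sketch the segment boundaries are passed in by hand; how the actual method locates sub-task boundaries and prompts the VLM is not stated in this brief.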

IMPACT: Uses VLM-based data augmentation to improve robot instruction following and generalization.

Vision-Language Models
LIBERO
TREAD