Brief · PulseAugur

TOOL · arXiv cs.AI English(EN) · 8h

RoboBenchMart: Benchmarking Robots in Retail Environment

Researchers have introduced RoboBenchMart, an open-source simulated benchmark for evaluating generalist vision-language-action (VLA) models in retail environments. The benchmark poses complex manipulation tasks involving diverse grocery items, with challenges such as dense clutter and varied spatial configurations. In initial evaluations, state-of-the-art models struggled with common retail tasks, indicating that current VLAs do not yet generalize well across domains. The RoboBenchMart suite includes tools for procedural store generation, trajectory generation, and evaluation, along with baseline models, to facilitate further research.

IMPACT Highlights current limitations of generalist VLAs in complex, realistic retail scenarios, guiding future research on retail automation.

Denis Shepelev
RoboBenchMart