Ego2World: Compiling Egocentric Cooking Videos into Executable Worlds for Belief-State Planning
Researchers have introduced Ego2World, a new benchmark designed to evaluate embodied agents' planning capabilities in realistic, partially observable environments. This benchmark transforms egocentric cooking videos into executable symbolic worlds, forcing agents to plan and replan based on limited observations and execution feedback. Experiments indicate that traditional evaluation metrics may overestimate performance, and that maintaining a persistent belief memory is crucial for successful task completion in such complex scenarios.
AI IMPACT: Introduces a novel benchmark for evaluating embodied agents, potentially improving their real-world planning and memory capabilities.
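
As a rough illustration of what a persistent belief memory under partial observability could look like, here is a minimal Python sketch. It is not taken from Ego2World: the class `BeliefMemory`, the function `next_action`, and the action strings are all hypothetical names chosen for this example, and the benchmark's actual interface may differ substantially.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class BeliefMemory:
    """Hypothetical belief store: object name -> last believed location."""
    locations: Dict[str, str] = field(default_factory=dict)

    def observe(self, visible: Dict[str, str]) -> None:
        # Only objects in the current view are updated; objects out of view
        # keep their last believed location (this is the "persistent" part).
        self.locations.update(visible)

    def record_failure(self, obj: str) -> None:
        # Execution feedback: an action on obj failed, so the stored
        # location is treated as stale and dropped.
        self.locations.pop(obj, None)


def next_action(belief: BeliefMemory, target: str) -> str:
    """Pick an action from the current belief; search if the location is unknown."""
    loc = belief.locations.get(target)
    if loc is None:
        return f"search_for({target})"
    return f"pick_up({target}, from={loc})"


belief = BeliefMemory()
belief.observe({"knife": "drawer", "onion": "counter"})
belief.observe({"pan": "stove"})           # knife is no longer in view
plan_before = next_action(belief, "knife")  # still planned from memory
belief.record_failure("knife")              # the pick-up failed at execution
plan_after = next_action(belief, "knife")   # agent replans by searching
```

In this sketch the agent can still act on the knife after it leaves the field of view, and a failed action triggers replanning instead of repeating the stale plan.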