Just in: the world's first "event-level prediction" world model for embodied intelligence has arrived!
X-Square Robot has introduced WALL-WM, an embodied AI world model that predicts at the level of semantic events rather than frame by frame. This lets a robot focus on the task objective, such as grasping a cup, by imagining the outcome and then generating the actions that lead to it. The model uses a multi-layered architecture for event prediction and can run in two distinct reasoning modes, which gives it flexibility across robotic applications. WALL-WM also incorporates multi-view fusion and a new decoding method aimed at interpretable, real-time robotic control.
AI IMPACT: Enables robots to grasp task objectives more effectively by focusing on semantic events rather than frame-by-frame prediction.