WorldOlympiad: Can Your World Model Survive a Triathlon?
WorldOlympiad is a new benchmark for evaluating video-based world models. Rather than stopping at typical metrics such as visual quality, it assesses physical faithfulness, geometric consistency, and interaction fidelity. The benchmark aims to reveal where current models fail to adhere to physical laws or to maintain coherent 3D structure over extended periods. Experiments with WorldOlympiad on state-of-the-art models have exposed significant gaps in their reasoning and interaction capabilities.
AI IMPACT: This benchmark could drive improvements in generative models' understanding of physics and 3D consistency, which are crucial for applications such as robotics and gaming.