Brief

last 24h

[2/2] 224 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.
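The feed names four scoring factors but does not say how they are combined. As a rough illustration only, the sketch below assumes a weighted sum of factors normalized to the range 0 to 1, with time decay modeled as an exponential half-life. The weights, the 24-hour half-life, and the function names are all hypothetical and are not taken from the source.

```python
# Hypothetical weights: the source lists the four factors but not their weighting.
WEIGHTS = {
    "authority": 0.35,
    "cluster_strength": 0.30,
    "headline_signal": 0.15,
    "time_decay": 0.20,
}
HALF_LIFE_HOURS = 24.0  # assumed half-life for the freshness factor


def clamp01(value):
    """Clamp a factor to the range [0, 1]."""
    return max(0.0, min(1.0, value))


def time_decay(age_hours, half_life_hours=HALF_LIFE_HOURS):
    """Exponential decay: 1.0 for a brand-new story, 0.5 after one half-life."""
    return 0.5 ** (max(age_hours, 0.0) / half_life_hours)


def score_story(authority, cluster_strength, headline_signal, age_hours):
    """Combine the four factors into an integer score from 0 to 100."""
    factors = {
        "authority": clamp01(authority),
        "cluster_strength": clamp01(cluster_strength),
        "headline_signal": clamp01(headline_signal),
        "time_decay": time_decay(age_hours),
    }
    weighted = sum(WEIGHTS[name] * value for name, value in factors.items())
    return round(100 * weighted)


if __name__ == "__main__":
    # A well-sourced story from a strong cluster, published six hours ago.
    print(score_story(0.9, 0.8, 0.6, age_hours=6))
```

Under these assumptions an older story can still rank well if its source authority and cluster strength are high, because freshness is only one of the four weighted terms.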

RESEARCH · Hugging Face Daily Papers English(EN) · 5d · [13 sources]

Latent Spatial Memory for Video World Models

Researchers have introduced "ImageTime," a benchmark that evaluates how well image generation models understand and represent temporal change. It assesses spatiotemporal consistency by requiring models to generate four ordered key states of an action, moving beyond single-image quality metrics. Separately, a framework called BiWM uses bidirectional autoregression to advance open-source interactive video world models, aiming to improve generation quality and inference speed. A third paper proposes "latent spatial memory" for video world models, which stores scene information directly in the diffusion latent space to speed up generation and reduce memory footprint.

IMPACT Advances in video world modeling benchmarks and frameworks could accelerate progress in generative AI for video and simulation.
- Mirage
- RealEstate10K
- WorldScore
- CALVIN
- Matrix-Game-3.0
- minWM
- ImageTime
- Wan2.2-5B
- LTX-2.3-22B
- Yume-1.5
- Wan2.1-1.3B
- HunyuanVideo-1.5-8B
- GPT-5.5

RESEARCH · Hugging Face Daily Papers English(EN) · 2w · [15 sources]

WBench: A Comprehensive Multi-turn Benchmark for Interactive Video World Model Evaluation

Researchers have developed new methods to accelerate interactive video world models, which generate video content in response to user camera movements. "Light Interaction" is a training-free approach that adaptively manages context and uses a denoising cache, achieving a speedup of up to 2.59x. Separately, the "minWM" framework provides an open-source pipeline for converting existing video diffusion models into real-time interactive world models. Additionally, a new benchmark called "WBench" evaluates these interactive video world models across multiple dimensions in a multi-turn setting.

IMPACT Advances in interactive video generation and world modeling could enable more realistic simulations and embodied AI training.