PulseAugur

New Dataset EgoCS-400K Advances World Models with Counter-Strike Gameplay

Researchers have introduced EgoCS-400K, a large-scale dataset derived from Counter-Strike gameplay and designed to advance interactive world models. The dataset comprises over 400,000 first-person videos and 10,000 hours of gameplay, with recorded player actions, camera movements, game states, and events. EgoCS-400K aims to bridge the gap between passive video data and the requirements of embodied AI by providing temporally aligned video-action-language trajectories.
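To make "temporally aligned" concrete, here is a minimal Python sketch of how logged actions might be paired with video frames by timestamp. It is an illustration only and is not taken from the EgoCS-400K release: the names `ActionEvent`, `Step`, and `align_actions_to_frames`, the field layout, and the key encoding are all hypothetical.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ActionEvent:
    """One logged input sample (hypothetical layout, not the dataset's schema)."""
    t: float                 # seconds since the start of the clip
    keys: Tuple[str, ...]    # pressed keys/buttons, e.g. ("W", "MOUSE1")
    yaw_delta: float         # camera rotation since the previous sample
    pitch_delta: float


@dataclass
class Step:
    """One video frame paired with the action in effect at that frame."""
    frame_index: int
    t: float
    action: Optional[ActionEvent]


def align_actions_to_frames(
    frame_times: List[float], events: List[ActionEvent]
) -> List[Step]:
    """Attach to each frame the most recent action at or before its timestamp."""
    events = sorted(events, key=lambda e: e.t)
    event_times = [e.t for e in events]
    steps = []
    for i, ft in enumerate(frame_times):
        # Index of the last event whose timestamp is <= the frame timestamp.
        j = bisect_right(event_times, ft) - 1
        steps.append(Step(i, ft, events[j] if j >= 0 else None))
    return steps


if __name__ == "__main__":
    frames = [0.0, 0.5, 1.0]
    log = [
        ActionEvent(0.4, ("W",), 0.0, 0.0),
        ActionEvent(0.9, ("W", "MOUSE1"), 2.0, 0.0),
    ]
    for step in align_actions_to_frames(frames, log):
        print(step.frame_index, step.action.keys if step.action else None)
```

A frame that precedes every logged action gets `None` rather than a guessed input. A language caption, game state, or event list could be attached to each `Step` or to the whole trajectory in the same way.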

IMPACT Provides a large-scale, action-rich dataset to train world models for interactive AI agents.

RANK_REASON The cluster describes a new research dataset published on arXiv.

AI-generated summary · Google Gemini · from 3 sources.

COVERAGE [3]

  1. Hugging Face Daily Papers TIER_1 English(EN) ·

    EgoCS-400K: An Egocentric Gameplay Dataset for World Models

    EgoCS-400K is a large-scale egocentric Counter-Strike dataset that bridges passive web videos and costly real-world embodied data by providing temporally aligned video-action-language trajectories with detailed player states and game events.

  2. arXiv cs.CV TIER_1 English(EN) · Rongjin Guo, Dong Liang, Yuhao Liu, Fang Liu, Tianyu Huang, Gerhard P. Hancke, Rynson W. H. Lau ·

    EgoCS-400K: An Egocentric Gameplay Dataset for World Models

    arXiv:2606.18180v1. Abstract: The shift from video generation to interactive world modeling places new demands on data: beyond captioned videos, world models require temporally aligned video-action-language trajectories grounded in the actions, camera motion, st…

  3. arXiv cs.CV TIER_1 English(EN) · Rynson W. H. Lau ·

    EgoCS-400K: An Egocentric Gameplay Dataset for World Models

    The shift from video generation to interactive world modeling places new demands on data: beyond captioned videos, world models require temporally aligned video-action-language trajectories grounded in the actions, camera motion, states, and events that drive future scene changes…