PulseAugur

AI systems take top spots in EgoVis 2026 challenges

Two technical reports describe entries to challenges at EgoVis 2026. JFAA, a JEPA-based Future Action Anticipation method, took first place in the EPIC-KITCHENS-100 Action Anticipation Challenge. MARS (Multimodal Agentic Reasoning with Source selection) placed second in the CASTLE Challenge by treating the task as an agentic evidence-selection problem across video, transcripts, and sensor data, using a GPT-5.4 decision agent.

Impact: Showcases advances in multimodal reasoning and action anticipation that may influence future embodied AI research.

Ranking rationale: Two technical reports describing AI systems that achieved top rankings in academic challenges.

Read on Hugging Face Daily Papers →

AI-generated summary · Google Gemini · From 3 sources. How we write summaries →

Sources [3]

  1. Hugging Face Daily Papers TIER_1

    JFAA: Technical Report for the EPIC-KITCHENS-100 Action Anticipation Challenge at EgoVis 2026

    We propose JFAA, a JEPA-based Future Action Anticipation method for the EPIC-KITCHENS-100 (EK-100) Action Anticipation task. Inspired by the representation learning and future prediction ability of V-JEPA 2.1, JFAA uses a frozen encoder and predictor to extract observed context f…

  2. arXiv cs.CV TIER_1 · Liqiang Nie

    JFAA: Technical Report for the EPIC-KITCHENS-100 Action Anticipation Challenge at EgoVis 2026

    We propose JFAA, a JEPA-based Future Action Anticipation method for the EPIC-KITCHENS-100 (EK-100) Action Anticipation task. Inspired by the representation learning and future prediction ability of V-JEPA 2.1, JFAA uses a frozen encoder and predictor to extract observed context f…

  3. arXiv cs.CV TIER_1 · Liqiang Nie

    MARS: Technical Report for the CASTLE Challenge at EgoVis 2026

    This report presents MARS, short for Multimodal Agentic Reasoning with Source selection, our system for the CASTLE Challenge at EgoVis 2026. Participants must answer 185 closed-form questions over the CASTLE 2024 dataset. In contrast to prior single-video egocentric benchmarks, C…