Plan, Watch, Recover: A Benchmark and Architectures for Proactive Procedural Assistance
Researchers have introduced EgoProactive, a new dataset, and Pro²Bench, a benchmark suite designed to evaluate proactive procedural assistance systems. These systems aim to provide real-time, step-by-step guidance for tasks, autonomously deciding when to interrupt and how to coach the user. The benchmark includes explicit annotations for out-of-plan deviations and recovery steps, addressing a key limitation of existing datasets. In extensive experiments, the proposed decoupled planner-interaction architecture, trained on models such as Llama 4 and Qwen-3.6-VL, outperformed both proprietary and open-weight baselines.
AI IMPACT: Establishes a new benchmark for AI procedural assistance, potentially improving user guidance systems and agent capabilities.