Researchers have developed SurgOnAir, a novel streaming vision-language model designed for real-time surgical video commentary. Unlike previous offline methods, SurgOnAir processes video frames sequentially to generate narration tokens as visual input becomes available, enabling immediate responsiveness to surgical dynamics. The model is trained on the SurgOnAir-11k dataset, which includes hierarchical supervision for action, step, and phase levels, allowing it to produce multi-level, hierarchy-aware textual responses and explicitly mark key workflow transitions.
Summary written by gemini-2.5-flash-lite from 1 source.
IMPACT Enables real-time AI assistance in surgery by providing immediate, context-aware commentary as a procedure unfolds.
RANK_REASON The cluster contains a new academic paper detailing a novel AI model and dataset.