SurgOnAir model provides real-time surgical video commentary

By PulseAugur Editorial · Summary by gemini-2.5-flash-lite from 1 source

Researchers have developed SurgOnAir, a streaming vision-language model for real-time surgical video commentary. Unlike previous offline methods, SurgOnAir processes video frames sequentially and generates narration tokens as visual input becomes available, so it can respond immediately to surgical dynamics. The model is trained on the SurgOnAir-11k dataset, which provides hierarchical supervision at the action, step, and phase levels. This lets it produce multi-level, hierarchy-aware text and explicitly mark key workflow transitions.
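The streaming idea can be illustrated with a toy sketch. This is not the paper's model or API: the `Labels` type, the `stream_commentary` function, and the `<..._transition>` marker format are hypothetical names chosen for illustration, and the per-frame labels stand in for what a real vision-language model would predict. The sketch only shows the control flow: frames are consumed one at a time, and commentary is emitted as soon as a change at the phase, step, or action level is seen.

```python
from typing import Iterable, Iterator, NamedTuple, Optional


class Labels(NamedTuple):
    """Hypothetical per-frame labels; a real model would predict these."""
    action: str
    step: str
    phase: str


def stream_commentary(frames: Iterable[Labels]) -> Iterator[str]:
    """Yield commentary lazily, one frame at a time.

    A line is emitted whenever a label changes at any hierarchy level,
    coarsest level first (phase, then step, then action).
    """
    prev: Optional[Labels] = None
    for t, cur in enumerate(frames):
        for level in ("phase", "step", "action"):
            new = getattr(cur, level)
            old = getattr(prev, level) if prev is not None else None
            if new != old:
                # The marker format is an assumption, not the paper's tokens.
                yield f"[t={t}] <{level}_transition> {new}"
        prev = cur


# Example with made-up labels for four frames.
frames = [
    Labels("grasp", "expose", "dissection"),
    Labels("grasp", "expose", "dissection"),
    Labels("cut", "expose", "dissection"),
    Labels("clip", "clipping", "closure"),
]
for line in stream_commentary(frames):
    print(line)
```

Because `stream_commentary` is a generator, each line is available before later frames are read, which mirrors the contrast between streaming and offline processing described above.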

IMPACT Could enable real-time AI assistance in surgery by providing immediate, context-aware commentary on surgical procedures.

RANK_REASON The cluster contains a new academic paper detailing a novel AI model and dataset.

COVERAGE [1]

arXiv cs.CV TIER_1 · Yuan Bi · 2026-05-20 13:04

SurgOnAir: Hierarchy-Aware Real-Time Surgical Video Commentary

Understanding surgical workflow in real time is fundamental for intelligent surgical embodiment, where AI systems continuously perceive and respond as surgery proceeds. In the operating room, critical decisions depend on subtle, moment-to-moment changes, such as fine instrument m…
