SurgLQA: Scalable Long-Horizon Surgical Video Question Answering
Researchers have developed SurgLQA, a framework for question answering over long surgical videos. Current approaches focus on short clips; SurgLQA addresses this limitation with Faithful Temporal Consolidation (FTC), which maintains temporal fidelity in long-range representations. It also features Temporally-Grounded Multi-Policy Scaling (TMS) for adaptive reasoning during inference. Experiments on Colon-LQA, a restructured colonoscopy dataset, and on the REAL-Colon-VQA benchmark show improved performance in long-range surgical video analysis.
AI IMPACT: Introduces a framework for long-horizon surgical video analysis, potentially improving clinical decision support and intraoperative interpretation.