TOOL · arXiv cs.CV English(EN) · 1mo

SurgLQA: Scalable Long-Horizon Surgical Video Question Answering

Researchers have developed SurgLQA, a framework for question answering over long surgical videos. Current approaches focus on short clips; SurgLQA addresses that limitation with Faithful Temporal Consolidation (FTC), which preserves temporal fidelity in long-range representations. It also includes Temporally-Grounded Multi-Policy Scaling (TMS) for adaptive reasoning at inference time. Experiments on Colon-LQA, a restructured colonoscopy dataset, and on the REAL-Colon-VQA benchmark show improved performance in long-range surgical video analysis.

IMPACT: Introduces a framework for long-horizon surgical video analysis that could improve clinical decision support and intraoperative interpretation.

Faithful Temporal Consolidation
Colon-LQA
SurgLQA
Temporally-Grounded Multi-Policy Scaling
REAL-Colon-VQA