PulseAugur

GeoSAM-3D enables 3D scene segmentation from monocular video

Researchers have introduced GeoSAM-3D, a method for segmenting objects in 3D scenes from monocular video alone. A user uploads a short video, selects an object in a single frame, and receives a propagated 3D mask. GeoSAM-3D achieves this by combining pre-trained image and video models with a 3D Gaussian Splatting reconstruction and a graph-geodesic propagation kernel.
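The summary does not spell out how the graph-geodesic propagation kernel is defined, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes the reconstruction yields a set of 3D points (for example, Gaussian centers), that they are linked into a k-nearest-neighbor graph, and that a soft mask falls off with shortest-path (geodesic) distance from the selected seed points through a Gaussian kernel. The function names, `k`, and `sigma` are all hypothetical.

```python
import heapq
import math


def knn_graph(points, k):
    """Link each point to its k nearest neighbours (brute force, symmetric edges)."""
    graph = {i: {} for i in range(len(points))}
    for i, p in enumerate(points):
        nearest = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        for d, j in nearest[:k]:
            graph[i][j] = d
            graph[j][i] = d
    return graph


def geodesic_distances(graph, seeds):
    """Multi-source Dijkstra: shortest graph distance from any seed to each node."""
    dist = {node: math.inf for node in graph}
    heap = []
    for s in seeds:
        dist[s] = 0.0
        heap.append((0.0, s))
    heapq.heapify(heap)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def propagate_mask(points, seeds, k=2, sigma=1.0):
    """Soft mask in [0, 1]: Gaussian falloff of geodesic distance (assumed kernel)."""
    dist = geodesic_distances(knn_graph(points, k), seeds)
    mask = []
    for i in range(len(points)):
        d = dist[i]
        # Points unreachable from every seed get zero weight.
        mask.append(math.exp(-(d * d) / (2 * sigma * sigma)) if math.isfinite(d) else 0.0)
    return mask


# Two well-separated clusters along the x-axis; the seed is the first point.
points = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (10, 0, 0), (11, 0, 0), (12, 0, 0)]
mask = propagate_mask(points, seeds=[0], k=1, sigma=1.0)
```

In this toy example the weight decays along the first cluster and is zero on the second, because no graph edge connects the two clusters. That is the property that distinguishes geodesic propagation from a plain Euclidean radius: the mask follows connectivity rather than straight-line distance.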

IMPACT Enables detailed 3D scene understanding from readily available monocular video, potentially impacting robotics and AR/VR.

RANK_REASON The cluster contains a research paper detailing a new method for 3D scene segmentation.

Read on arXiv cs.AI →

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. arXiv cs.AI · TIER_1 · English (EN) · Arun Sharma

    GeoSAM-3D: Geodesic Prompt Propagation for Open-Vocabulary 3D Scene Segmentation from Monocular Video

    arXiv:2606.00447v1 Announce Type: cross Abstract: Open-vocabulary 3D scene segmentation usually assumes RGB-D video, calibrated multi-view imagery, or a reconstructed mesh. GeoSAM-3D studies a lighter setting: a user uploads a short monocular video, clicks or names an object in o…