SCANNET
PulseAugur coverage of SCANNET — every cluster mentioning SCANNET across labs, papers, and developer communities, ranked by signal.
-
MMD-SLAM enhances Visual SLAM with structure-guided Gaussian mapping
Researchers have introduced MMD-SLAM, a novel Visual SLAM framework designed to enhance mapping quality and tracking robustness by incorporating structural information. This new system leverages the Atlanta World assump…
-
New system Savvy tackles open-world video segmentation challenges
Researchers have introduced Savvy, a new system designed for open-world video segmentation, which addresses the challenges of object discovery and identity maintenance in long, dynamic videos. To better evaluate such sy…
-
Pano3D framework unifies 3D reconstruction and panoptic segmentation
Researchers have developed Pano3D, a novel framework that unifies 3D reconstruction and 3D panoptic segmentation. By augmenting existing 3D reconstruction models with a set-based mask decoder and employing a joint geome…
-
New method matches 2D polygons for pose estimation
Researchers have introduced a novel Zero-shot Polygon Matching paradigm with Pre-trained Models (Z(PM)²) to address the challenges of matching 2D polygons in stereo imagery. This method leverages pre-trained models like…
-
Robots improve map accuracy with calibrated foundation model data
Researchers have developed a new method to improve the reliability of semantic information integrated into robotic mapping systems. This approach calibrates the per-class reliability of foundation model claims and imple…
-
Robust Dreamer improves AR video generation with new memory techniques
Researchers have developed Robust Dreamer, a new framework designed to improve action-controlled autoregressive (AR) video generation. The system addresses challenges like visual drift and 3D inconsistency in long autoregressive sequenc…
-
New DA-FSS model improves multimodal few-shot 3D point cloud segmentation
Researchers have introduced a new model called DA-FSS to improve few-shot 3D point cloud segmentation. This model addresses the "Plasticity-Stability Dilemma" and CLIP's inter-class confusion by decoupling semantic and …
-
New transformers enhance 3D scene reconstruction and edge deployment
Researchers have developed new transformer-based models for 3D scene reconstruction from visual inputs. DVGT, a Driving Visual Geometry Transformer, reconstructs dense 3D point maps from unposed multi-view images withou…
-
New VLM framework boosts 3D view planning with self-exploration
Researchers have developed a new framework to improve the view planning capabilities of Vision-Language Models (VLMs) in 3D environments. The proposed method alternates self-exploration with view graph distillation, whe…
-
ESAM++ offers efficient 3D perception for edge devices
Researchers have developed ESAM++, an efficient method for real-time 3D scene perception on edge devices. This new approach addresses the computational demands of previous methods like ESAM by introducing a lightweight …
-
New Gaussian-Voxel Duet improves 3D surface reconstruction
Researchers have developed a novel hybrid representation called Gaussian-Voxel Duet to improve monocular surface reconstruction. This method combines 3D Gaussian Splatting with a sparse voxel scaffold, confining Gaussia…
-
New M2H-MX model boosts monocular 3D scene understanding
Researchers have developed M2H-MX, a novel multi-task perception model designed for real-time 3D scene graph construction using monocular cameras. This model enhances both depth and semantic estimation by allowing these…
-
New AI models advance 3D shape completion and depth estimation
Researchers have introduced several new models for 3D shape completion and depth estimation. The Large Depth Completion Model (LDCM) uses a transformer to generate dense depth maps from sparse observations, outperformin…
-
Cambrian-P video model uses camera pose for improved spatial reasoning
Researchers have introduced Cambrian-P, a novel video multimodal large language model (MLLM) that incorporates camera pose information. This approach treats video frames not as isolated images but as part of a continuou…
-
New MIND framework tackles model-induced label noise
Researchers have introduced MIND, a novel framework designed to tackle model-induced label noise in machine learning. This noise arises from the inherent biases of pre-trained models used for data annotation, leading to…
-
Invaria encoder learns scale and density invariance for 3D point clouds
Researchers have developed a new point cloud encoder called Invaria, designed to overcome the sensitivity of current 3D models to changes in scale and density. Unlike image encoders, 3D models often struggle with genera…
-
New frameworks tackle open-vocabulary 3D scene graph generation
Two new research papers introduce novel frameworks for generating open-vocabulary 3D scene graphs. The first, RelWitness, addresses incomplete supervision by using visual-geometric cues to verify relations between objec…
-
EvObj advances unsupervised 3D instance segmentation with domain adaptation
Researchers have developed EvObj, a novel approach for unsupervised 3D instance segmentation that overcomes the domain gap between synthetic and real-world data. The method employs an object discerning module to adapt o…
-
New systems map and align 3D scene graphs using RGB cameras
Researchers have developed new methods for creating 3D scene graphs, which are crucial for robot navigation and understanding. LEXI-SG, a novel system, enables dense monocular visual mapping using only RGB camera input,…
-
New FSTM method efficiently learns indoor 3D geometry and semantics
Researchers have developed a new method called FSTM for indoor 3D reconstruction that efficiently learns both geometry and semantics. This approach first optimizes geometry using RGB inputs and geometric cues, then esti…