TOOL · arXiv cs.CV · 1w

M2H-MX: Multi-Task Semantic and Geometric Perception for Real-Time Monocular 3D Scene Graph Construction

Researchers have developed M2H-MX, a multi-task perception model for real-time 3D scene graph construction from a monocular camera. Its lightweight decoder lets the depth and semantic predictions reinforce each other, improving both. When integrated into a monocular SLAM pipeline, M2H-MX significantly reduces trajectory error and produces cleaner metric-semantic maps, demonstrating its effectiveness for robotic perception.

IMPACT: Enhances real-time 3D scene understanding for robots, which could improve their navigation and interaction capabilities.

arXiv
ScanNet
M2H-MX
NYUDv2