M2H-MX: Multi-Task Semantic and Geometric Perception for Real-Time Monocular 3D Scene Graph Construction
Researchers have developed M2H-MX, a multi-task perception model for real-time 3D scene graph construction from a monocular camera. The model improves both depth and semantic estimation by letting the two predictions reinforce each other within a lightweight decoder. When integrated into a monocular SLAM pipeline, M2H-MX significantly reduces trajectory error and produces cleaner metric-semantic maps, demonstrating its effectiveness for robotic perception.
IMPACT Enhances real-time 3D scene understanding for robots, potentially improving navigation and interaction capabilities.