M2H-MX: Multi-Task Semantic and Geometric Perception for Real-Time Monocular 3D Scene Graph Construction

New M2H-MX Model Improves Monocular 3D Scene Understanding

By the PulseAugur editorial team · [1 source] · 2026-05-26 04:00

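Researchers have developed M2H-MX, a new multi-task perception model designed for real-time 3D scene graph construction from a monocular camera. The model improves its depth and semantic predictions by letting the two estimates reinforce each other inside a lightweight decoder. When integrated into a monocular SLAM pipeline, M2H-MX significantly reduces trajectory error and produces finer metric-semantic maps, demonstrating its effectiveness for robotic perception.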

Impact: Strengthens robots' real-time 3D scene understanding, which may improve navigation and interaction.

Ranking rationale: This is a research paper that details a new model and its benchmark performance.


AI-generated summary · Google Gemini · from 1 source.

Sources [1]

arXiv cs.CV · U. V. B. L. Udugama, George Vosselman, Francesco Nex · 2026-05-26 04:00

M2H-MX: Multi-Task Semantic and Geometric Perception for Real-Time Monocular 3D Scene Graph Construction

arXiv:2603.29236v2. Abstract: Monocular cameras are attractive for robotic perception due to their low cost and ease of deployment, yet achieving reliable real-time spatial understanding from a single image stream remains challenging. While recent multi-task…

