SS3D pipeline enables self-supervised 3D estimation from web videos

By PulseAugur Editorial · Summary by gemini-2.5-flash-lite from 2 sources

Researchers have developed SS3D, a self-supervised pretraining pipeline for estimating 3D structure from monocular video. The end-to-end model jointly predicts depth, ego-motion, and camera intrinsics in a single forward pass. It is trained on a large dataset derived from YouTube-8M and, according to the authors, shows strong zero-shot transfer and outperforms existing self-supervised methods. The team has also released the pretrained model checkpoint and the associated code.
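To make the three predicted quantities concrete, here is a minimal sketch of the standard pinhole reprojection step that self-supervised depth and ego-motion methods commonly build on. It is an illustration under stated assumptions, not SS3D's confirmed implementation: the function name `reproject`, the argument layout, and the target-to-source pose convention are all hypothetical.

```python
import numpy as np

def reproject(depth, K, T):
    """Map every target-frame pixel to source-frame pixel coordinates.

    Hypothetical helper for illustration only.
    depth: (H, W) per-pixel depth of the target frame.
    K:     (3, 3) pinhole intrinsics matrix.
    T:     (4, 4) rigid transform from target camera to source camera.
    Returns an array of shape (2, H, W) holding (u, v) in the source frame.
    """
    h, w = depth.shape
    # Homogeneous pixel grid, shape (3, H*W)
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=0).reshape(3, -1).astype(np.float64)
    # Back-project pixels to 3D points using depth and inverse intrinsics
    pts = (np.linalg.inv(K) @ pix) * depth.reshape(1, -1)
    # Move the points into the source camera frame (ego-motion)
    pts_src = T[:3, :3] @ pts + T[:3, 3:4]
    # Project back to the image plane with the same intrinsics
    proj = K @ pts_src
    uv = proj[:2] / proj[2:3]
    return uv.reshape(2, h, w)

# Example inputs (made up): constant depth of 2 and a 0.1-unit shift along x
K = np.array([[100.0, 0.0, 2.0], [0.0, 100.0, 1.5], [0.0, 0.0, 1.0]])
depth = np.full((3, 4), 2.0)
T = np.eye(4)
T[0, 3] = 0.1
uv = reproject(depth, K, T)
```

In a typical view-synthesis setup, the source frame is sampled at these coordinates and compared with the target frame, so that errors in depth, pose, or intrinsics all show up as a photometric mismatch. Whether SS3D uses exactly this loss is not stated in the summary above.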


IMPACT Enables more robust 3D reconstruction from monocular video, potentially improving applications in robotics and augmented reality.

RANK_REASON Academic paper detailing a new self-supervised 3D estimation pipeline.



COVERAGE [2]

arXiv cs.CV TIER_1 · Marwane Hariat, Gianni Franchi, David Filliat, Antoine Manzanera · 2026-04-27 04:00

SS3D: End2End Self-Supervised 3D from Web Videos

arXiv:2604.22686v1 · Abstract: We present SS3D, a web-scale SfM-based self-supervision pretraining pipeline for feed-forward 3D estimation from monocular video. Our model jointly predicts depth, ego-motion, and intrinsics in a single forward pass and is trained/e…

arXiv cs.CV TIER_1 · Antoine Manzanera · 2026-04-24 16:12

SS3D: End2End Self-Supervised 3D from Web Videos

We present SS3D, a web-scale SfM-based self-supervision pretraining pipeline for feed-forward 3D estimation from monocular video. Our model jointly predicts depth, ego-motion, and intrinsics in a single forward pass and is trained/evaluated as a coherent end-to-end 3D estimator. …

