RayDer transformer scales novel view synthesis with real-world video

By PulseAugur Editorial · [3 sources] · 2026-05-29 00:00

Researchers have developed RayDer, a unified feed-forward transformer for self-supervised novel view synthesis from real-world video. The model consolidates camera estimation, scene reconstruction, and rendering into a single backbone, which enables stable training on dynamic video content. RayDer shows predictable power-law scaling with data and compute and achieves competitive zero-shot performance on various benchmarks.
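To make the power-law scaling claim concrete, here is a minimal sketch of how such a trend is typically checked. A power law of the form loss = a · compute^(-b) is a straight line in log-log space, so the exponent can be recovered with ordinary least squares on the logarithms. The function name `fit_power_law`, the variables `compute` and `loss`, and the constants 5.0 and 0.3 are illustrative assumptions; the data is synthetic and does not come from the RayDer paper.

```python
import math

def fit_power_law(xs, ys):
    """Fit y = a * x**(-b) by least squares on (log x, log y). Returns (a, b)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx = sum(lx) / n
    my = sum(ly) / n
    # Slope and intercept of the best-fit line in log-log space.
    sxx = sum((u - mx) ** 2 for u in lx)
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    slope = sxy / sxx
    intercept = my - slope * mx
    return math.exp(intercept), -slope

# Synthetic, noise-free points generated from made-up values a = 5.0, b = 0.3.
compute = [1e15, 1e16, 1e17, 1e18, 1e19]
loss = [5.0 * c ** -0.3 for c in compute]

a, b = fit_power_law(compute, loss)

# Extrapolate one order of magnitude beyond the largest observed budget.
predicted = a * 1e20 ** -b
```

With real measurements the points would scatter around the line, and the quality of the fit indicates how predictable the scaling is.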

IMPACT Enables more scalable and robust novel view synthesis by leveraging general video data, potentially impacting 3D reconstruction and content creation.

AI-generated summary · Google Gemini · from 3 sources.

COVERAGE [3]

arXiv cs.AI TIER_1 English(EN) · Ulrich Prestel, Stefan Andreas Baumann, Nick Stracke, Björn Ommer · 2026-06-01 04:00

RayDer: Scalable Self-Supervised Novel View Synthesis from Real-World Video

arXiv:2605.31535v1 Announce Type: cross Abstract: Self-supervised novel view synthesis (NVS) remains challenging to scale, despite the abundance of video data, largely due to the brittleness of training on realistic videos and the hard-to-predict scaling behavior of multi-network…
arXiv cs.AI TIER_1 English(EN) · Björn Ommer · 2026-05-29 16:50

RayDer: Scalable Self-Supervised Novel View Synthesis from Real-World Video

Self-supervised novel view synthesis (NVS) remains challenging to scale, despite the abundance of video data, largely due to the brittleness of training on realistic videos and the hard-to-predict scaling behavior of multi-network system designs. We introduce RayDer, a unified, f…
Hugging Face Daily Papers TIER_1 English(EN) · 2026-05-29 00:00

RayDer: Scalable Self-Supervised Novel View Synthesis from Real-World Video

RayDer is a unified feed-forward transformer that consolidates camera estimation, scene reconstruction, and rendering for self-supervised novel view synthesis, enabling stable training on real-world video through dynamic state absorption and demonstrating clean scaling behavior.

