Brief · PulseAugur

RESEARCH · arXiv cs.CV · 1d · [2 sources]

PROSE: Training-Free Egocentric Scene Registration with Vision-Language Models

Researchers have introduced PROSE, a method for registering egocentric RGB sequences that requires neither training nor depth sensors. PROSE uses pre-trained vision-language models to build object-level 3D scene graphs and to match object instances across different captures. On the Aria Digital Twin and Aria Everyday Activities benchmarks, it outperforms existing geometric and learned scene-graph methods.

IMPACT This method could enable more robust spatial memory for robots and AR systems by improving egocentric scene registration.

Hugging Face
arXiv
Aria Digital Twin
PROSE
Aria Everyday Activities