PulseAugur

PIGEON framework uses VLM for efficient object navigation

Researchers have developed PIGEON, a framework for object navigation in unseen indoor environments. PIGEON uses Vision-Language Models (VLMs) by formulating navigation as a sparse decision problem: "Points of Interest" (PoIs) couple executable waypoints with visual observations, the VLM selects the critical PoI, such as an exploration frontier or a target object, and low-level planners handle continuous motion. Experiments on Habitat ObjectNav benchmarks show that PIGEON achieves state-of-the-art zero-shot performance and demonstrates robustness on physical robots.
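The division of labor described above can be illustrated with a small Python sketch. This is not the authors' implementation: the `PoI` fields, the `toy_score` function (a keyword-matching stand-in for a real VLM query), and the straight-line planner are all hypothetical names and simplifications chosen for illustration.

```python
from dataclasses import dataclass
from math import ceil, hypot
from typing import Callable, List, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class PoI:
    """Hypothetical Point of Interest: a waypoint paired with an observation."""
    waypoint: Point    # map coordinates a low-level planner can drive to
    kind: str          # "frontier" or "object" (labels assumed for this sketch)
    observation: str   # stand-in for the image or caption shown to the VLM


def toy_score(poi: PoI, goal: str) -> float:
    """Stand-in for a VLM: prefer a matching object, then any frontier."""
    if poi.kind == "object" and goal in poi.observation:
        return 1.0
    return 0.5 if poi.kind == "frontier" else 0.0


def select_poi(pois: List[PoI], goal: str,
               score: Callable[[PoI, str], float]) -> PoI:
    """Sparse decision step: choose one PoI instead of a low-level action."""
    return max(pois, key=lambda p: score(p, goal))


def plan_straight_line(start: Point, target: Point,
                       step: float = 0.25) -> List[Point]:
    """Toy low-level planner: evenly spaced points ending at the target."""
    dist = hypot(target[0] - start[0], target[1] - start[1])
    n = max(1, ceil(dist / step))
    return [(start[0] + (target[0] - start[0]) * i / n,
             start[1] + (target[1] - start[1]) * i / n)
            for i in range(1, n + 1)]


candidates = [
    PoI((1.0, 0.0), "frontier", "unexplored hallway"),
    PoI((2.0, 0.0), "object", "a chair near a table"),
]
chosen = select_poi(candidates, "chair", toy_score)
path = plan_straight_line((0.0, 0.0), chosen.waypoint, step=0.5)
```

In this sketch the high-level selector is called once per decision, while the planner produces the continuous motion toward the chosen waypoint. When no candidate object matches the goal, the toy scorer falls back to a frontier, which loosely mirrors the exploration behavior the summary describes.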

IMPACT This framework could improve robotic navigation efficiency and adaptability in complex, unseen environments.

RANK_REASON The cluster contains a research paper detailing a new framework for object navigation.


AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. arXiv cs.CV · TIER_1 · English (EN) · Cheng Peng, Zhenzhe Zhang, Xiaobao Wei, Yanhao Zhang, Heng Wang, Pengwei Wang, Zhongyuan Wang, Cheng Chi, Shanghang Zhang, Jing Liu

    PIGEON: VLM-Driven Object Navigation via Points of Interest Selection

    arXiv:2511.13207v2. Abstract: Object navigation in unseen indoor environments requires agents to perform semantic search under partial observability. Vision-language models (VLMs) provide strong semantic-spatial priors for this task, but how to interfa…