PIGEON: VLM-Driven Object Navigation via Points of Interest Selection
Researchers have developed PIGEON, a new framework for object navigation in unseen indoor environments. PIGEON leverages Vision-Language Models (VLMs) by formulating navigation as a sparse decision problem, using "Points of Interest" (PoIs) to couple executable waypoints with visual observations. This approach allows VLMs to select critical PoIs, such as exploration frontiers or target objects, while low-level planners handle continuous motion. Experiments on Habitat ObjectNav benchmarks show PIGEON achieves state-of-the-art zero-shot performance and demonstrates robustness on physical robots.
AI IMPACT: This framework could improve robotic navigation efficiency and adaptability in complex, unseen environments.
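
The division of labor described above, in which a high-level selector chooses among a handful of PoIs and a low-level planner handles the motion, can be illustrated with a minimal sketch. This is not PIGEON's implementation: the `PoI` fields, `stub_selector` (a rule-based stand-in for the VLM), and the straight-line `step_toward` planner are all hypothetical names and simplifications introduced here for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class PoI:
    """Hypothetical Point of Interest: a waypoint paired with an observation."""
    waypoint: Point   # executable goal on the map (x, y)
    kind: str         # "frontier" or "object" (assumed labels)
    view_id: str      # assumed handle to the associated visual observation


# A selector takes the target category and candidate PoIs, returns an index.
Selector = Callable[[str, List[PoI]], int]


def stub_selector(target: str, pois: List[PoI]) -> int:
    """Stand-in for the VLM: prefer an object PoI, else the first candidate."""
    for i, poi in enumerate(pois):
        if poi.kind == "object":
            return i
    return 0


def step_toward(pos: Point, goal: Point, step: float = 0.5) -> Point:
    """Toy low-level planner: move in a straight line, ignoring obstacles."""
    dx, dy = goal[0] - pos[0], goal[1] - pos[1]
    dist = (dx * dx + dy * dy) ** 0.5
    if dist <= step:
        return goal
    return (pos[0] + step * dx / dist, pos[1] + step * dy / dist)


def navigate_to_selected(pos: Point, target: str, pois: List[PoI],
                         select: Selector = stub_selector,
                         max_steps: int = 100) -> Tuple[PoI, Point]:
    """One sparse decision (pick a PoI), then many continuous motion steps."""
    choice = pois[select(target, pois)]
    for _ in range(max_steps):
        if pos == choice.waypoint:
            break
        pos = step_toward(pos, choice.waypoint)
    return choice, pos


if __name__ == "__main__":
    candidates = [
        PoI((2.0, 0.0), "frontier", "view_0"),
        PoI((0.0, 3.0), "object", "view_1"),
    ]
    chosen, final_pos = navigate_to_selected((0.0, 0.0), "chair", candidates)
    print(chosen.kind, final_pos)
```

The point of the sketch is that the selector is called once per decision rather than once per motion step, which is what makes the decision problem sparse. In a real system the selector would query a VLM with the observations attached to each PoI, and the planner would account for obstacles; the sketch also assumes the candidate list is non-empty.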