VISTA navigation model uses action history to improve robot generalization

By PulseAugur Editorial · [1 source] · 2026-06-17 04:00

Researchers have introduced VISTA, a visual navigation approach that addresses a weakness of the normalized actions used by Vision Navigation Foundation Models (VNMs). By conditioning the model on its history of normalized actions, VISTA gives the policy explicit context on how its predictions relate to physical displacement, mitigating performance degradation and collision risk. The model also integrates a DINOv3 encoder to better handle visually repetitive environments by capturing spatial and geometric information. In zero-shot real-world deployments across outdoor, forest, and office settings, VISTA reportedly achieved 100% goal prediction accuracy and crossed 95% of checkpoints on average.
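
To make the idea of action-history conditioning concrete, here is a minimal Python sketch. It is not VISTA's architecture: the summary does not say how actions are normalized or how the history is encoded, so the per-robot scale `scale_m`, the fixed-length buffer, and the names `normalize_actions`, `ActionHistory`, and `build_policy_input` are all assumptions made for illustration.

```python
from collections import deque

import numpy as np


def normalize_actions(waypoints_m, scale_m):
    """Divide metric waypoint offsets (meters) by a per-robot scale (meters).

    Assumption: the scale is a single positive number per robot, for example
    the distance covered in one control step at top speed.
    """
    return np.asarray(waypoints_m, dtype=float) / float(scale_m)


class ActionHistory:
    """Fixed-length buffer holding the most recent normalized actions."""

    def __init__(self, length, action_dim=2):
        # Start with zeros so the vector has a constant size from step one.
        self.buffer = deque([np.zeros(action_dim)] * length, maxlen=length)

    def push(self, action):
        # Appending to a full deque drops the oldest entry automatically.
        self.buffer.append(np.asarray(action, dtype=float))

    def as_vector(self):
        # Oldest action first, newest last, flattened into one vector.
        return np.concatenate(list(self.buffer))


def build_policy_input(image_features, history):
    """Concatenate visual features with the flattened action history.

    Hypothetical conditioning scheme: a real model could instead use
    tokens, cross-attention, or another fusion mechanism.
    """
    features = np.asarray(image_features, dtype=float)
    return np.concatenate([features, history.as_vector()])
```

In this sketch a policy would receive the visual features together with the last few normalized actions, so the same normalized prediction can be interpreted against what the robot recently executed. How VISTA actually fuses the two signals is described in the paper, not in this summary.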

IMPACT Enhances robot navigation robustness by conditioning on action history, improving generalization in diverse environments.

RANK_REASON The cluster contains an academic paper detailing a new model and its performance.

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

arXiv cs.LG TIER_1 English(EN) · Maeva Guerrier, Koki Kobayashi, Simon Roy, Jana Pavlasek, Giovanni Beltrame · 2026-06-17 04:00

VISTA: Scale-Aware Visual Navigation via Action History Conditioning

arXiv:2606.17294v1 · Abstract: Vision Navigation Foundation Models (VNMs) promise end-to-end learned navigation policies capable of zero-shot deployment across diverse embodiments and environments. To maintain generality, many vision-based navigation models pre…
