PulseAugur
Chinese (ZH) ICRA 2026 | Meituan & Beihang University propose LIBERO-X: five-level progressive testing reveals robustness shortcomings in VLA models

Meituan and Beihang University propose LIBERO-X benchmark for VLA model robustness

Researchers from Meituan and Beihang University have introduced LIBERO-X, a benchmark designed to test the robustness of Vision-Language-Action (VLA) models. Whereas earlier benchmarks focused on average success rates, LIBERO-X uses a five-level progressive testing protocol that simulates real-world deployment challenges: object repositioning, scene changes, novel objects, visual interference, and instruction rewrites. In the reported experiments, prominent VLA models lose significant performance as difficulty increases, particularly under topological changes, unseen objects, and semantic instruction variations, which points to a gap in their ability to generalize under distribution shift.

IMPACT The benchmark could push the development of more robust VLA models that handle real-world complexity and distribution shift.

RANK_REASON The cluster describes a new research paper proposing a novel benchmark for evaluating AI models.

Read on 雷峰网 (Leiphone) →

AI-generated summary · Google Gemini · from 1 source. How we write summaries →

COVERAGE [1]

  1. 雷峰网 (Leiphone) TIER_1 Chinese (ZH)

    ICRA 2026 | Meituan & Beihang University Propose LIBERO-X: Five-Level Progressive Testing Reveals Shortcomings in VLA Model Robustness

    Original author: WeChat official account “计算机顶会大全”…