PulseAugur
Vision-language-action model

PulseAugur coverage of vision-language-action (VLA) models: every cluster mentioning vision-language-action models across labs, papers, and developer communities, ranked by signal.

Total · 30d: 33 (33 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 24 (24 over 90d)
SENTIMENT · 30D: 8 days with sentiment data

RECENT · PAGE 1/2 · 33 TOTAL
  1. TOOL · CL_108134

    DriveStack-VLA enhances driving models with spatial intelligence and self-critique

    Researchers have introduced DriveStack-VLA, a novel framework designed to enhance the spatial intelligence of vision-language-action driving models. This system leverages a large vision-language model backbone and incor…

  2. RESEARCH · CL_107766

    New G3VLA module enhances robot manipulation VLA models with geometric awareness

    Researchers have introduced G$^3$VLA, a novel module designed to enhance Vision-Language-Action (VLA) models for robot manipulation. This module addresses the mismatch between 2D image coordinates and the calibrated geo…

  3. RESEARCH · CL_107697

    New framework enables robots to adapt to new environments without retraining

    Researchers have introduced In-Context World Modeling (ICWM), a new framework designed to improve the adaptability of robotic policies. ICWM treats system identification as an in-context adaptation problem, enabling rob…

  4. TOOL · CL_105106

    New RECALL method improves VLA model learning with active data collection

    Researchers have introduced RECALL, a novel approach to active lifelong learning for Vision-Language-Action (VLA) models. Unlike passive imitation learning, which requires failures to trigger data collection and offers …

  5. RESEARCH · CL_106805

    New methods enhance VLA model efficiency and performance in robotics · 9 sources tracked

    Researchers are developing new methods to improve the efficiency and performance of Vision-Language-Action (VLA) models in robotics. One approach, Flow Policy Optimization (FPO), uses reinforcement learning to fine-tune…

  6. TOOL · CL_106618

    New protocol measures commonsense knowledge in VLA models

    Researchers have developed Act2Answer, a new evaluation protocol designed to assess the commonsense and world knowledge retained by Vision-Language-Action (VLA) models after fine-tuning on robotics data. This protocol a…

  7. RESEARCH · CL_93885

    Vision-language models lack agency and knowledge retention, new papers reveal

    Two new research papers highlight limitations in current vision-language models (VLMs), particularly concerning their ability to retain knowledge after fine-tuning and their lack of "agency" in visual reasoning. The fir…

  8. TOOL · CL_93220

    New QPILOTS method enhances reinforcement learning for diffusion policies

    Researchers have introduced QPILOTS, a novel method designed to improve the efficiency of reinforcement learning (RL) for flow-matching and diffusion policies. This technique steers the denoising process at inference ti…

  9. TOOL · CL_93215

    ScoutVLA model enhances UAV question answering with active perception

    Researchers have introduced ScoutVLA, a novel dual-expert vision-language-action model designed for aerial embodied question answering. This model addresses the limitations of existing systems by enabling unmanned aeria…

  10. RESEARCH · CL_95767

    Egocentric human video outperforms robot data for embodied AI pretraining

    Researchers have found that egocentric human video can be a more effective and cost-efficient data source for pretraining embodied foundation models compared to traditional teleoperated robot trajectories. Studies indic…

  11. TOOL · CL_92089

    New APT method boosts VLA model generalization with action expert pretraining

    Researchers have developed a new method called APT (Action Expert Pretraining) to improve the generalization capabilities of Vision-Language-Action (VLA) models. These models, which combine vision-language understanding…

  12. TOOL · CL_56796

    NUS develops FD-VLA model for enhanced robotic manipulation

    Researchers from the National University of Singapore have developed FD-VLA, a novel Vision-Language-Action (VLA) model designed to improve robotic manipulation in contact-rich tasks. Unlike previous VLA models that pri…

  13. TOOL · CL_28211

    Embodied AI redefines computer vision's role at CVPR 2026

    Embodied AI is shifting the focus of computer vision research, moving from understanding static images to enabling intelligent agents to interact with and manipulate the real world. This paradigm shift, evident at CVPR …

  14. COMMENTARY · CL_23975

    NVIDIA's Jim Fan: VLA and remote operation dead, World Action Models rise

    NVIDIA's Jim Fan declared the end of Vision-Language-Action (VLA) models and remote operation in robotics, advocating for World Action Models (WAM) as the new paradigm. Fan proposed that WAMs, inspired by Large Language…

  15. COMMENTARY · CL_22327

    Airbnb revenue up 18% to $2.7B, while XPeng's AI driving usage surges

    Airbnb reported an 18% year-over-year increase in first-quarter revenue, reaching $2.7 billion. The company also achieved a net profit of $160 million and adjusted EBITDA of $519 million, a 24% increase. Separately, Xpen…

  16. TOOL · CL_22328

    XPeng's second-gen VLA system achieves over 50% autonomous driving mileage

    XPeng's second-generation VLA system has achieved over 50% of its intelligent driving mileage within a month of its rollout. During the May Day holiday, the AI-assisted driving system saw a daily usage rate of 93.21%, a…

  17. RESEARCH · CL_21835

    MobileEgo Anywhere releases 200-hour egocentric dataset for VLA models

    Researchers have introduced MobileEgo Anywhere, a framework and dataset designed to collect extensive egocentric data using commodity mobile devices. This initiative aims to overcome the limitations of existing datasets…

  18. TOOL · CL_18860

    AhaRobot: Low-cost open-source bimanual manipulator for embodied AI

    Researchers have developed AhaRobot, a low-cost, open-source bimanual mobile manipulator designed to facilitate embodied AI research. The system features a novel SCARA-like dual-arm design for reduced motor torque and a…

  19. TOOL · CL_15563

    New attack method targets Transformer vulnerabilities in autonomous driving systems

    Researchers have developed a new gray-box attack framework called Adversarial Flow Matching (AFM) that targets vulnerabilities in Transformer modules used by end-to-end autonomous driving systems. AFM can generate visua…

  20. RESEARCH · CL_13053

    VLA emerges as top solution for embodied AI, despite sensory limitations

    Vision-Language-Action (VLA) models are currently the leading architecture for embodied AI due to their strong task generalization capabilities. However, VLA models have limitations, particularly in tactile and proprioceptive s…