ENTITY Vision-language-action model

Vision-language-action model

PulseAugur coverage of Vision-language-action model — every cluster mentioning Vision-language-action model across labs, papers, and developer communities, ranked by signal.

Total · 30d

33 over 90d

Releases · 30d

0 over 90d

Papers · 30d

24 over 90d

TIER MIX · 90D

research 11
tool 19
commentary 3

SENTIMENT · 30D

8 day(s) with sentiment data

RECENT · PAGE 1/2 · 33 TOTAL

TOOL · CL_108134 · Jun 24 · 04:00

DriveStack-VLA enhances driving models with spatial intelligence and self-critique

Researchers have introduced DriveStack-VLA, a novel framework designed to enhance the spatial intelligence of vision-language-action driving models. This system leverages a large vision-language model backbone and incor…
RESEARCH · CL_107766 · Jun 23 · 12:02

New G3VLA module enhances robot manipulation VLA models with geometric awareness

Researchers have introduced G³VLA, a novel module designed to enhance Vision-Language-Action (VLA) models for robot manipulation. This module addresses the mismatch between 2D image coordinates and the calibrated geo…
RESEARCH · CL_107697 · Jun 23 · 00:00

New framework enables robots to adapt to new environments without retraining

Researchers have introduced In-Context World Modeling (ICWM), a new framework designed to improve the adaptability of robotic policies. ICWM treats system identification as an in-context adaptation problem, enabling rob…
TOOL · CL_105106 · Jun 22 · 17:12

New RECALL method improves VLA model learning with active data collection

Researchers have introduced RECALL, a novel approach to active lifelong learning for Vision-Language-Action (VLA) models. Unlike passive imitation learning, which requires failures to trigger data collection and offers …
RESEARCH · CL_106805 · Jun 18 · 00:00

New methods enhance VLA model efficiency and performance in robotics · 9 sources tracked

Researchers are developing new methods to improve the efficiency and performance of Vision-Language-Action (VLA) models in robotics. One approach, Flow Policy Optimization (FPO), uses reinforcement learning to fine-tune…
TOOL · CL_106618 · Jun 17 · 17:20

New protocol measures commonsense knowledge in VLA models

Researchers have developed Act2Answer, a new evaluation protocol designed to assess the commonsense and world knowledge retained by Vision-Language-Action (VLA) models after fine-tuning on robotics data. This protocol a…
RESEARCH · CL_93885 · Jun 16 · 04:00

Vision-language models lack agency and knowledge retention, new papers reveal

Two new research papers highlight limitations in current vision-language models (VLMs), particularly concerning their ability to retain knowledge after fine-tuning and their lack of "agency" in visual reasoning. The fir…
TOOL · CL_93220 · Jun 16 · 04:00

New QPILOTS method enhances reinforcement learning for diffusion policies

Researchers have introduced QPILOTS, a novel method designed to improve the efficiency of reinforcement learning (RL) for flow-matching and diffusion policies. This technique steers the denoising process at inference ti…
TOOL · CL_93215 · Jun 16 · 04:00

ScoutVLA Model Enhances UAV Question Answering with Active Perception

Researchers have introduced ScoutVLA, a novel dual-expert vision-language-action model designed for aerial embodied question answering. This model addresses the limitations of existing systems by enabling unmanned aeria…
RESEARCH · CL_95767 · Jun 15 · 00:00

Egocentric human video outperforms robot data for embodied AI pretraining

Researchers have found that egocentric human video can be a more effective and cost-efficient data source for pretraining embodied foundation models compared to traditional teleoperated robot trajectories. Studies indic…
TOOL · CL_92089 · Jun 10 · 00:00

New APT method boosts VLA model generalization with action expert pretraining

Researchers have developed a new method called APT (Action Expert Pretraining) to improve the generalization capabilities of Vision-Language-Action (VLA) models. These models, which combine vision-language understanding…
TOOL · CL_56796 · May 28 · 07:56

NUS develops FD-VLA model for enhanced robotic manipulation

Researchers from the National University of Singapore have developed FD-VLA, a novel Vision-Language-Action (VLA) model designed to improve robotic manipulation in contact-rich tasks. Unlike previous VLA models that pri…
TOOL · CL_28211 · May 12 · 07:48

Embodied AI redefines computer vision's role at CVPR 2026

Embodied AI is shifting the focus of computer vision research, moving from understanding static images to enabling intelligent agents to interact with and manipulate the real world. This paradigm shift, evident at CVPR …
COMMENTARY · CL_23975 · May 9 · 06:24

NVIDIA's Jim Fan: VLA and remote operation dead, World Action Models rise

NVIDIA's Jim Fan declared the end of Vision-Language-Action (VLA) models and remote operation in robotics, advocating for World Action Models (WAM) as the new paradigm. Fan proposed that WAMs, inspired by Large Language…
COMMENTARY · CL_22327 · May 8 · 05:42

Airbnb revenue up 18% to $2.7B, while XPeng's AI driving usage surges

Airbnb reported an 18% year-over-year increase in first-quarter revenue, reaching $2.7 billion. The company also achieved a net profit of $160 million and adjusted EBITDA of $519 million, a 24% increase. Separately, Xpen…
TOOL · CL_22328 · May 8 · 05:18

XPeng's second-gen VLA system achieves over 50% autonomous driving mileage

XPeng's second-generation VLA system has achieved over 50% of its intelligent driving mileage within a month of its rollout. During the May Day holiday, the AI-assisted driving system saw a daily usage rate of 93.21%, a…
RESEARCH · CL_21835 · May 7 · 09:55

MobileEgo Anywhere releases 200-hour egocentric dataset for VLA models

Researchers have introduced MobileEgo Anywhere, a framework and dataset designed to collect extensive egocentric data using commodity mobile devices. This initiative aims to overcome the limitations of existing datasets…
TOOL · CL_18860 · May 6 · 04:00

AhaRobot: Low-cost open-source bimanual manipulator for embodied AI

Researchers have developed AhaRobot, a low-cost, open-source bimanual mobile manipulator designed to facilitate embodied AI research. The system features a novel SCARA-like dual-arm design for reduced motor torque and a…
TOOL · CL_15563 · May 5 · 04:00

New attack method targets Transformer vulnerabilities in autonomous driving systems

Researchers have developed a new gray-box attack framework called Adversarial Flow Matching (AFM) that targets vulnerabilities in Transformer modules used by end-to-end autonomous driving systems. AFM can generate visua…
RESEARCH · CL_13053 · May 2 · 13:01

VLA emerges as top solution for embodied AI, despite sensory limitations

Vision-Language-Action (VLA) models are currently the leading architecture for embodied AI due to their strong task generalization capabilities. However, VLA models have limitations, particularly in tactile and proprioceptive s…
