ActionMap: Robot Policy Learning via Voxel Action Heatmap
Researchers have developed ActionMap, a voxel heatmap action head for robot policy learning in vision-language-action (VLA) models. The head replaces the traditional action decoder and predicts a heatmap over the action space, which lets the model exploit the geometric proximity of actions. In simulation and real-world tests, ActionMap outperformed existing methods and was more data-efficient, suggesting that action representation is a key factor in VLA model effectiveness.
AI IMPACT: ActionMap's improved data efficiency and performance could accelerate VLA model development and real-world robot deployment.
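
To make the idea of a heatmap over the action space concrete, here is a minimal sketch in Python with NumPy. It is not ActionMap's implementation: the summary does not give the grid bounds, the resolution, the target shape, or the decoding rule, so the 3D action space, the Gaussian target, the argmax decoding, and all function names below are illustrative assumptions.

```python
import numpy as np

# Hypothetical helpers for illustration only; none of these names come from ActionMap.

def voxel_centers(low, high, bins):
    """Return voxel-center coordinates with shape (bins, bins, bins, 3)."""
    low = np.asarray(low, dtype=float)
    high = np.asarray(high, dtype=float)
    step = (high - low) / bins
    axes = [low[d] + step[d] * (np.arange(bins) + 0.5) for d in range(3)]
    return np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1)

def action_to_heatmap(action, centers, sigma):
    """Encode a continuous 3D action as a normalized Gaussian heatmap (assumed target)."""
    d2 = np.sum((centers - np.asarray(action, dtype=float)) ** 2, axis=-1)
    heat = np.exp(-d2 / (2.0 * sigma ** 2))
    return heat / heat.sum()

def heatmap_to_action(heat, centers):
    """Decode a heatmap to the center of its highest-scoring voxel (assumed rule)."""
    idx = np.unravel_index(np.argmax(heat), heat.shape)
    return centers[idx]

def overlap(h1, h2):
    """Shared probability mass between two heatmaps, a value in [0, 1]."""
    return float(np.minimum(h1, h2).sum())

if __name__ == "__main__":
    centers = voxel_centers((-1, -1, -1), (1, 1, 1), bins=20)
    target = action_to_heatmap((0.23, -0.41, 0.07), centers, sigma=0.15)
    near = action_to_heatmap((0.28, -0.41, 0.07), centers, sigma=0.15)
    far = action_to_heatmap((-0.80, 0.70, 0.07), centers, sigma=0.15)
    print("decoded action:", heatmap_to_action(target, centers))
    print("overlap with nearby action:", overlap(target, near))
    print("overlap with distant action:", overlap(target, far))
```

In this sketch, two actions that lie close together produce heatmaps that share most of their mass, while distant actions share almost none. That is one way a heatmap representation can expose geometric proximity that a set of unrelated discrete tokens would hide.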