Brief · PulseAugur

TOOL · arXiv cs.LG English(EN) · 2w

A KL-regularization Framework for Learning to Plan with Adaptive Priors

Researchers have introduced Policy Optimization-Model Predictive Control (PO-MPC), a framework for model-based reinforcement learning that improves sample efficiency in continuous control tasks. It unifies existing methods by using the planner's action distribution as a prior in policy optimization, which allows a flexible trade-off between maximizing return and minimizing the KL divergence from that prior. Experiments show that PO-MPC configurations advance the state of the art in MPPI-based reinforcement learning.
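The trade-off described above can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes diagonal Gaussian action distributions for both the policy and the planner prior, picks the KL direction KL(policy || prior) as one possible choice, and uses hypothetical names (`gaussian_kl`, `kl_regularized_loss`, `kl_weight`) that do not come from the source.

```python
import numpy as np

def gaussian_kl(mu_p, std_p, mu_q, std_q):
    """KL(p || q) between diagonal Gaussians, summed over action dimensions."""
    var_p, var_q = std_p ** 2, std_q ** 2
    per_dim = (np.log(std_q / std_p)
               + (var_p + (mu_p - mu_q) ** 2) / (2.0 * var_q)
               - 0.5)
    return per_dim.sum(axis=-1)

def kl_regularized_loss(q_values, mu_pi, std_pi, mu_prior, std_prior, kl_weight):
    """Batch mean of -Q + kl_weight * KL(policy || prior).

    q_values: estimated returns for the policy's actions, shape (batch,).
    kl_weight: hypothetical coefficient; 0 ignores the planner prior,
    larger values pull the policy toward the planner's distribution.
    """
    kl = gaussian_kl(mu_pi, std_pi, mu_prior, std_prior)
    return float(np.mean(-q_values + kl_weight * kl))

# Example: a batch of 2 states with 3-dimensional actions.
q = np.array([1.0, 3.0])
prior_mu, prior_std = np.zeros((2, 3)), np.ones((2, 3))
policy_mu, policy_std = prior_mu + 1.0, prior_std
loss = kl_regularized_loss(q, policy_mu, policy_std, prior_mu, prior_std, kl_weight=2.0)
```

Setting `kl_weight` to zero reduces the loss to pure return maximization, while a large value makes the policy imitate the planner; intermediate values sit between the two.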

IMPACT Introduces a novel framework that improves sample efficiency and performance in model-based reinforcement learning tasks.

Policy Optimization-Model Predictive Control
Model-Predictive Path Integral
Alvaro Serra-Gomez