Plan-R1: Safe and Feasible Trajectory Planning as Language Modeling
Researchers have developed Plan-R1, a two-stage framework for autonomous-driving trajectory planning that treats planning as a language-modeling problem. The approach first pre-trains a general trajectory predictor on expert data to learn human-like behaviors, then fine-tunes it with rule-based rewards for safety and compliance. A key component is Variance-Decoupled GRPO, which addresses limitations in existing optimization methods so that safety-critical objectives remain prioritized during training. Experiments on the nuPlan benchmark show Plan-R1 achieves state-of-the-art performance, particularly in realistic reactive scenarios.
AI IMPACT: Enhances safety and feasibility in autonomous driving, potentially accelerating real-world deployment.
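
The summary does not say how Variance-Decoupled GRPO works, so the following is only a rough sketch of the kind of limitation it may be targeting, not the authors' method. Standard GRPO scores each sampled trajectory relative to its group by subtracting the group mean reward and dividing by the group standard deviation. That division erases the absolute size of reward gaps, so a large safety penalty and a tiny comfort difference can end up with the same advantage. The function `centered_advantages` and its `scale` parameter are hypothetical names introduced here for contrast; they are an assumption, not something taken from the paper.

```python
from statistics import mean, pstdev


def grpo_advantages(rewards, eps=1e-6):
    """Standard GRPO-style advantages: center by the group mean,
    then divide by the group standard deviation."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def centered_advantages(rewards, scale=1.0):
    """Hypothetical contrast (an assumption, not the paper's formula):
    center by the group mean but divide by a fixed constant, so the
    absolute size of reward gaps is preserved."""
    mu = mean(rewards)
    return [(r - mu) / scale for r in rewards]


# Two groups of sampled trajectories with very different reward gaps.
small_gap = [0.1, 0.0]    # e.g. a minor comfort difference
large_gap = [10.0, 0.0]   # e.g. a collision penalty avoided or not

# Std normalization maps both groups to nearly the same advantages.
norm_small = grpo_advantages(small_gap)
norm_large = grpo_advantages(large_gap)

# Fixed-scale centering keeps the large gap 100 times bigger.
cent_small = centered_advantages(small_gap)
cent_large = centered_advantages(large_gap)
```

Under these assumptions, both groups receive normalized advantages close to +1 and -1, while the centered variant keeps the large-gap group's advantages 100 times larger than the small-gap group's. Whether Plan-R1 resolves the issue this way is not stated in the summary.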