Value Functions for Temporal Logic: Optimal Policies and Safety Filters
Researchers have developed a method for constructing optimal policies for temporal logic specifications in reinforcement learning. The approach builds on existing work by decomposing the value function of a specification and using the pieces to build non-Markovian policies, that is, policies whose action depends on the state history rather than on the current state alone. The resulting Q-function also serves as a safety filter for complex temporal logic tasks, extending this use beyond the basic reach and avoid scenarios covered previously.
AI IMPACT: Introduces an approach to policy optimization and safety filtering in reinforcement learning for complex temporal logic tasks.
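To make the safety-filter idea concrete, the following is a minimal sketch of how a Q-function can gate a nominal action. It is not the method from the work summarized above: the function name safety_filter, the threshold convention (Q-value at or above 0 means the task can still be satisfied), and the toy Q-function on a one-dimensional line are all assumptions made for illustration.

```python
from typing import Callable, Hashable, Sequence

State = Hashable
Action = Hashable


def safety_filter(
    q: Callable[[State, Action], float],
    state: State,
    nominal_action: Action,
    actions: Sequence[Action],
    threshold: float = 0.0,
) -> Action:
    """Keep the nominal action if its Q-value clears the threshold.

    Otherwise fall back to the action with the highest Q-value.
    Assumed convention: q(state, action) >= threshold means the
    specification can still be satisfied after taking that action.
    """
    if q(state, nominal_action) >= threshold:
        return nominal_action
    return max(actions, key=lambda a: q(state, a))


# Hypothetical toy problem: positions on a line, position 0 is unsafe.
ACTIONS = (-1, 0, 1)


def toy_q(state: int, action: int) -> float:
    # Assumed stand-in for a learned Q-function: the signed margin
    # between the next position and the unsafe position 0.
    return (state + action) - 1


# Near the unsafe position the nominal action is overridden.
filtered_near = safety_filter(toy_q, 1, -1, ACTIONS)
# Far from it the nominal action passes through unchanged.
filtered_far = safety_filter(toy_q, 3, -1, ACTIONS)
```

In this sketch the state is a single position. For a temporal logic task the state passed to the filter would presumably also carry whatever summary of the history the non-Markovian policy relies on; how that summary is built is not specified in the text above.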