Brief · PulseAugur

TOOL · arXiv cs.LG English(EN) · 1mo

Value Functions for Temporal Logic: Optimal Policies and Safety Filters

Researchers have developed a new method for constructing optimal policies for temporal logic specifications in reinforcement learning. The approach builds on existing work by decomposing value functions and constructing non-Markovian policies that depend on state history. The Q-function also serves as a safety filter for complex temporal logic tasks, extending earlier filters that handled only basic reach and avoid scenarios.
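To make the safety-filter idea concrete, here is a minimal sketch of a generic Q-value filter. It is not the paper's algorithm. It assumes a tabular Q-function stored as a dictionary and assumes that a Q-value at or above a threshold means the task can still be satisfied after taking that action. The names `safety_filter`, `q`, and `threshold` are hypothetical.

```python
from typing import Dict, Hashable, Sequence, Tuple

State = Hashable
Action = Hashable


def safety_filter(
    q: Dict[Tuple[State, Action], float],
    state: State,
    nominal_action: Action,
    actions: Sequence[Action],
    threshold: float = 0.0,
) -> Action:
    """Keep the nominal action if its Q-value clears the threshold.

    Otherwise fall back to the action with the highest Q-value.
    Assumption: Q >= threshold means the specification is still satisfiable.
    """
    if q[(state, nominal_action)] >= threshold:
        return nominal_action
    # Override: pick the action the Q-function rates as best in this state.
    return max(actions, key=lambda a: q[(state, a)])


# Hypothetical toy Q-table with two states and two actions.
q_table = {
    ("s0", "left"): -1.0,
    ("s0", "right"): 0.5,
    ("s1", "left"): 0.2,
    ("s1", "right"): 0.9,
}
actions = ["left", "right"]

overridden = safety_filter(q_table, "s0", "left", actions)  # nominal is unsafe
kept = safety_filter(q_table, "s1", "left", actions)        # nominal is safe
```

The filter intervenes only when the nominal action falls below the threshold, so a task policy that is already safe is left unchanged. For a temporal logic task, the state passed in would also need to encode the relevant history, for example through an automaton state, which this sketch leaves out.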

IMPACT Introduces a novel approach to policy optimization and safety filtering in reinforcement learning for complex temporal logic tasks.

arXiv
Bellman equations
Q-function