Lyapunov-Based Sample Complexity Analysis for Weakly-Coupled MDPs
Researchers have developed a Lyapunov-based framework for analyzing the sample complexity of learning in weakly-coupled Markov decision processes (WCMDPs) and restless bandits (RBs). The approach learns near-optimal policies more efficiently than naive reductions, with polynomial sample and computational complexity. The framework establishes finite-sample PAC guarantees with improved optimality gaps, and its key technical contribution is a fine-grained perturbation analysis of linear programming relaxations.
AI IMPACT Introduces a theoretical framework that could lead to more efficient learning algorithms for sequential decision-making problems.