Sign-Separated Finite-Time Error Analysis of Q-Learning

Q-Learning Error Analysis Reveals Overestimation Dynamics

By the PulseAugur editorial team · [1 source] · 2026-05-15 15:54

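A new paper by Donghwan Lee develops a finite-time error analysis of Q-learning with a constant step size. The analysis decomposes the error into its componentwise negative and positive parts and shows that the negative part is governed by a stable linear time-invariant system associated with the optimal policy. The approach identifies an asymmetry in Q-learning's error dynamics, linking overestimation to positive errors that propagate through the Bellman maximum.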
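To make the decomposition concrete, the following is a minimal sketch, not the paper's construction. It runs tabular Q-learning with a constant step size on a small random MDP and splits the error `Q - Q*` into componentwise positive and negative parts. The toy MDP, the uniform i.i.d. sampling of state-action pairs, the deterministic rewards, and all function names (`make_random_mdp`, `optimal_q`, `sign_split`, `run_q_learning`) are assumptions introduced for illustration only.

```python
import numpy as np

def make_random_mdp(n_states, n_actions, rng):
    """Hypothetical toy MDP: random kernel P[s, a, s'] and rewards R[s, a]."""
    P = rng.random((n_states, n_actions, n_states))
    P /= P.sum(axis=2, keepdims=True)  # each P[s, a, :] sums to 1
    R = rng.random((n_states, n_actions))
    return P, R

def optimal_q(P, R, gamma, iters=2000):
    """Approximate Q* by repeatedly applying the Bellman optimality operator."""
    Q = np.zeros_like(R)
    for _ in range(iters):
        Q = R + gamma * P @ Q.max(axis=1)
    return Q

def sign_split(error):
    """Componentwise split: error == pos + neg, with pos >= 0 and neg <= 0."""
    pos = np.maximum(error, 0.0)
    neg = np.minimum(error, 0.0)
    return pos, neg

def run_q_learning(P, R, gamma, alpha, steps, rng, q_init=0.0):
    """Constant step-size Q-learning; records the sup-norm of each error part."""
    n_states, n_actions = R.shape
    Q = np.full((n_states, n_actions), float(q_init))
    Q_star = optimal_q(P, R, gamma)
    history = []
    for _ in range(steps):
        # Assumption: (s, a) drawn uniformly and independently at every step.
        s = rng.integers(n_states)
        a = rng.integers(n_actions)
        s_next = rng.choice(n_states, p=P[s, a])
        # The max over next actions is the Bellman maximum from the summary.
        target = R[s, a] + gamma * Q[s_next].max()
        Q[s, a] += alpha * (target - Q[s, a])
        pos, neg = sign_split(Q - Q_star)
        history.append((pos.max(), -neg.min()))
    return np.array(history)  # columns: positive-part norm, negative-part norm

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    P, R = make_random_mdp(n_states=5, n_actions=3, rng=rng)
    hist = run_q_learning(P, R, gamma=0.9, alpha=0.1, steps=5000, rng=rng)
    print("final positive-part norm:", hist[-1, 0])
    print("final negative-part norm:", hist[-1, 1])
```

Varying `q_init` (for example, starting well above or well below `Q*`) and plotting the two columns of the returned array is one way to look at how the overestimation and underestimation parts evolve separately in this toy setting.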

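Impact: Offers a deeper theoretical understanding of Q-learning error dynamics, which could lead to more robust and efficient reinforcement learning agents.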

Ranking rationale: Academic paper detailing a new theoretical analysis of a reinforcement learning algorithm.

AI-generated summary · Google Gemini · based on 1 source.

Sources [1]

arXiv cs.AI · Donghwan Lee · 2026-05-15 15:54

Sign-Separated Finite-Time Error Analysis of Q-Learning

This paper develops a sign-separated finite-time error analysis for constant step-size Q-learning. Starting from the switching-system representation, the error is decomposed into its componentwise negative and positive parts. The negative part is dominated by a lower comparison l…