Beyond the Independence Assumption: Finite-Sample Guarantees for Deep Q-Learning under $\tau$-Mixing

New research challenges the independence assumption in deep Q-learning

By the PulseAugur editorial team · [2 sources] · 2026-05-07 14:52

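Researchers have developed a new statistical analysis of Deep Q-Networks (DQN) that accounts for temporal dependence in the training data. The approach models the minibatches as $\tau$-mixing, going beyond the usual independence assumption. The results indicate that temporal dependence can slow the statistical rate of learning by introducing a dimension penalty, which in effect reduces the sample size.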
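To make the idea of an "effectively smaller sample size" concrete, here is a minimal Python sketch. It is not the paper's analysis and does not reproduce its $\tau$-mixing setup: it assumes a stationary Gaussian AR(1) series as a simple stand-in for temporally dependent data, and the helpers `ar1_series`, `variance_of_sample_mean`, and `effective_sample_size` are hypothetical names introduced only for this illustration. It compares how noisy a sample mean is when the draws are independent versus correlated.

```python
import random
import statistics


def ar1_series(n, rho, rng):
    """Draw n samples from a stationary AR(1) process with unit marginal variance.

    Assumption for illustration only: rho = 0 gives independent draws,
    rho close to 1 gives strongly dependent draws.
    """
    x = rng.gauss(0.0, 1.0)  # start in the stationary distribution
    innovation_sd = (1.0 - rho ** 2) ** 0.5
    out = []
    for _ in range(n):
        out.append(x)
        x = rho * x + innovation_sd * rng.gauss(0.0, 1.0)
    return out


def variance_of_sample_mean(n, rho, trials, seed=0):
    """Estimate by simulation the variance of the mean of n AR(1) samples."""
    rng = random.Random(seed)
    means = [statistics.fmean(ar1_series(n, rho, rng)) for _ in range(trials)]
    return statistics.pvariance(means)


def effective_sample_size(n, rho):
    """Standard large-n approximation for AR(1): n * (1 - rho) / (1 + rho)."""
    return n * (1.0 - rho) / (1.0 + rho)


n, rho = 200, 0.8
var_iid = variance_of_sample_mean(n, 0.0, trials=1000)
var_dep = variance_of_sample_mean(n, rho, trials=1000)
n_eff = effective_sample_size(n, rho)

print(f"variance of mean, independent draws: {var_iid:.4f}")
print(f"variance of mean, dependent draws:   {var_dep:.4f}")
print(f"approximate effective sample size:   {n_eff:.1f} of {n}")
```

Under these assumptions, the approximation puts the effective sample size at roughly 22 of the 200 draws for `rho = 0.8`, which is the same qualitative effect the summary describes: dependent data carries less information than its nominal count suggests.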

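Impact: Provides a more accurate theoretical understanding of deep reinforcement learning algorithms, which may lead to more robust training methods.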

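Ranking rationale: This is a research paper posted on arXiv that details a new theoretical framework for a machine learning algorithm, along with empirical validation.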

AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

arXiv cs.LG TIER_1 English(EN) · Leon Halgryn (University of Twente), Sophie Langer (Ruhr-Universität Bochum), Janusz M. Meylahn (University of Twente), E. Moritz Hahn (University of Twente) · 2026-05-08 04:00

Beyond the Independence Assumption: Finite-Sample Guarantees for Deep Q-Learning under $\tau$-Mixing

arXiv:2605.06373v1 Announce Type: cross Abstract: Finite-sample analyses of deep Q-learning typically treat replayed data as independent, even though it is sampled from temporally dependent state-action trajectories. We study the Deep Q-networks (DQN) algorithm under explicit dep…
arXiv stat.ML TIER_1 English(EN) · E. Moritz Hahn · 2026-05-07 14:52

Beyond the Independence Assumption: Finite-Sample Guarantees for Deep Q-Learning under $\tau$-Mixing

Finite-sample analyses of deep Q-learning typically treat replayed data as independent, even though it is sampled from temporally dependent state-action trajectories. We study the Deep Q-networks (DQN) algorithm under explicit dependence by modelling the minibatches used for upda…