Entity
Polyak--Ruppert
PulseAugur coverage of Polyak--Ruppert — every cluster mentioning Polyak--Ruppert across labs, papers, and developer communities, ranked by signal.
Total · 30 days
3
3 in 90 days
Releases · 30 days
0
0 in 90 days
Papers · 30 days
3
3 in 90 days
Tier distribution · 90 days
Sentiment · 30 days
1 day with sentiment data
Recent · page 1 of 1 · 3 items
-
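New Q-learning method achieves an n^{-1/4} Gaussian approximation bound
Researchers have developed a new approach to Gaussian approximation for entropy-regularized Q-learning with function approximation. The study establishes convergence rates for the averaged iterates produced by asynchronous Q-learning, achieving a Gaussian approximation bound of order n^{-1/4}. The work combines a linearization of the soft Bellman recursion with a Gaussian approximation of the leading martingale term, and also derives high-order moment bounds for the algorithm's last iterate.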
-
Researchers develop novel bootstrap for SGD confidence sets
Researchers have developed a novel method for constructing confidence sets in Stochastic Gradient Descent (SGD) algorithms. This new approach utilizes the multiplier bootstrap procedure and establishes its non-asymptoti…
-
New research identifies stabilization threshold for dynamic preconditioning in online inference
Researchers have identified a critical stabilization threshold for dynamic preconditioning in gradient descent methods. This threshold determines when the Polyak-Ruppert averaging technique, fundamental for online infer…