A Robust Model-Based Approach for Continuous-Time Policy Evaluation with Unknown Lévy Process Dynamics

New reinforcement learning framework uses Lévy processes to model rare events

By the PulseAugur editorial team · [1 source] · 2026-06-24 04:00

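Researchers have developed a new model-based framework for continuous-time policy evaluation in reinforcement learning. The approach accounts for both Brownian and Lévy noise, which is essential for modeling rare and extreme events. It involves solving a complex partial integro-differential equation and includes a novel iterative tail-correction mechanism that accurately recovers the unknown coefficients of the stochastic dynamics, particularly those driven by heavy-tailed Lévy processes. Numerical experiments, including an analysis of real-world Bitcoin price data, demonstrate the effectiveness of this robust numerical method.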
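To make the setting concrete, here is a minimal Python sketch of dynamics driven by both Brownian and heavy-tailed Lévy noise, together with a Monte Carlo estimate of a discounted value. It is not the paper's method, which solves a partial integro-differential equation and recovers unknown coefficients. Everything specific here is an assumption chosen for illustration: the mean-reverting drift `-x`, the bounded reward `cos(x)`, the symmetric alpha-stable noise with `alpha = 1.5`, the Euler-Maruyama discretization, and all function and parameter names.

```python
import numpy as np

def symmetric_stable(alpha, size, rng):
    """Chambers-Mallows-Stuck sampler for standard symmetric alpha-stable variates."""
    u = rng.uniform(-np.pi / 2, np.pi / 2, size)
    w = rng.exponential(1.0, size)
    if alpha == 1.0:
        return np.tan(u)  # Cauchy case
    return (np.sin(alpha * u) / np.cos(u) ** (1 / alpha)
            * (np.cos((1 - alpha) * u) / w) ** ((1 - alpha) / alpha))

def simulate_paths(x0, drift, sigma, jump_scale, alpha, dt, n_steps, n_paths, rng):
    """Euler-Maruyama paths of dX = drift(X) dt + sigma dW + jump_scale dL."""
    x = np.full(n_paths, x0, dtype=float)
    paths = np.empty((n_steps + 1, n_paths))
    paths[0] = x
    for k in range(n_steps):
        dw = rng.normal(0.0, np.sqrt(dt), n_paths)                   # Brownian increment
        dl = dt ** (1 / alpha) * symmetric_stable(alpha, n_paths, rng)  # stable increment
        x = x + drift(x) * dt + sigma * dw + jump_scale * dl
        paths[k + 1] = x
    return paths

def discounted_value(paths, reward, beta, dt):
    """Monte Carlo estimate of E[ integral of exp(-beta t) r(X_t) dt ] on the simulated horizon."""
    t = dt * np.arange(paths.shape[0] - 1)
    discount = np.exp(-beta * t)[:, None]
    return float((discount * reward(paths[:-1]) * dt).sum(axis=0).mean())

# Hypothetical parameters, chosen only for illustration.
rng = np.random.default_rng(0)
beta, dt = 1.0, 0.01
paths = simulate_paths(x0=0.0, drift=lambda x: -x, sigma=0.5, jump_scale=0.2,
                       alpha=1.5, dt=dt, n_steps=500, n_paths=2000, rng=rng)
value = discounted_value(paths, reward=np.cos, beta=beta, dt=dt)
print(value)
```

The reward is kept bounded on purpose: with heavy-tailed jumps, an unbounded reward can have infinite expectation, which is one reason tail behavior needs careful treatment in this setting.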

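Impact: Introduces a new way to handle complex stochastic dynamics in reinforcement learning, with the potential to improve agent performance in environments with rare but impactful events.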

Ranking rationale: An academic paper detailing a new reinforcement learning method.


Qihao Ye

AI-generated summary · Google Gemini · from 1 source.

Sources [1]

arXiv cs.LG · Qihao Ye, Xiaochuan Tian, Yuhua Zhu · 2026-06-24 04:00

A Robust Model-Based Approach for Continuous-Time Policy Evaluation with Unknown Lévy Process Dynamics

arXiv:2504.01482v3 Announce Type: replace-cross Abstract: This paper develops a model-based framework for continuous-time policy evaluation (CTPE) in reinforcement learning, incorporating both Brownian and Lévy noise to model stochastic dynamics influenced by rare and extreme e…
