Laplacian Representations for Decision-Time Planning

New Laplacian representation strengthens planning in reinforcement learning

By the PulseAugur editorial team · [1 source] · 2026-06-03 04:00

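Researchers have introduced Laplacian representations for decision-time planning (ALPS), a new hierarchical planning algorithm designed for model-based reinforcement learning. ALPS uses a Laplacian representation to capture state-space distances at multiple time scales, which lets it decompose long-horizon problems into subgoals and reduce compounding errors. On offline goal-conditioned reinforcement learning tasks from the OGBench benchmark, the algorithm outperforms the model-free methods that previously dominated.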
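To make the idea of a Laplacian representation concrete, the following is a minimal toy sketch and not the ALPS algorithm itself, whose details the summary does not give. It assumes a small chain of states with a known adjacency matrix, embeds each state using the first non-trivial eigenvector of the graph Laplacian, and picks a subgoal whose embedding lies roughly midway between the current state and the goal. The helper names `chain_adjacency`, `laplacian_embedding`, and `pick_subgoal` are hypothetical and chosen only for illustration.

```python
import numpy as np


def chain_adjacency(n_states):
    """Adjacency matrix of a toy chain: state i is connected to state i + 1."""
    adjacency = np.zeros((n_states, n_states))
    idx = np.arange(n_states - 1)
    adjacency[idx, idx + 1] = 1.0
    adjacency[idx + 1, idx] = 1.0
    return adjacency


def laplacian_embedding(adjacency, dim):
    """Embed states with the `dim` smallest non-trivial eigenvectors of L = D - A."""
    degree = np.diag(adjacency.sum(axis=1))
    laplacian = degree - adjacency
    # eigh returns eigenvalues in ascending order; column 0 is the constant vector.
    _, eigvecs = np.linalg.eigh(laplacian)
    return eigvecs[:, 1:dim + 1]


def pick_subgoal(embedding, current, goal, candidates):
    """Illustrative rule (an assumption, not from the paper): choose the candidate
    that minimizes the larger of its embedding distances to `current` and `goal`."""
    d_current = np.linalg.norm(embedding[candidates] - embedding[current], axis=1)
    d_goal = np.linalg.norm(embedding[candidates] - embedding[goal], axis=1)
    return candidates[int(np.argmin(np.maximum(d_current, d_goal)))]


n_states = 21
phi = laplacian_embedding(chain_adjacency(n_states), dim=1)
subgoal = pick_subgoal(phi, current=0, goal=n_states - 1,
                       candidates=np.arange(n_states))
print(subgoal)
```

On this chain the one-dimensional embedding is monotonic along the states, so distance in the embedding tracks distance along the chain, and the selected subgoal is the middle state (index 10). A learned representation over a large state space would replace the exact eigendecomposition used here.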

Impact: Introduces a novel planning method for reinforcement learning that could improve agent performance on complex, long-horizon tasks.

Ranking rationale: The cluster contains a research paper that details a new algorithm and its benchmark results. [lever_c_demoted from research: ic=1 ai=1.0]


AI-generated summary · Google Gemini · based on 1 source.

Sources [1]

arXiv cs.LG TIER_1 English(EN) · Dikshant Shehmar, Matthew Schlegel, Matthew E. Taylor, Marlos C. Machado · 2026-06-03 04:00

Laplacian Representations for Decision-Time Planning

arXiv:2602.05031v2 Announce Type: replace Abstract: Planning with a learned model remains a key challenge in model-based reinforcement learning (RL). In decision-time planning, state representations are critical as they must support local cost computation while preserving long-ho…

