Hierarchical Behaviour Spaces

New Hierarchical Behaviour Spaces method improves exploration in reinforcement learning

By PulseAugur Editorial Team · [2 sources] · 2026-04-27 14:47

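Researchers have introduced Hierarchical Behaviour Spaces (HBS), a new hierarchical reinforcement learning method that uses linear combinations of reward functions to define a broader space of behaviours. Compared with the conventional setup of one reward function per option, this allows a more expressive policy representation. Experiments on the NetHack Learning Environment show that HBS achieves strong performance, and its advantage is attributed to improved exploration rather than long-horizon reasoning.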
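To make the idea of a linear combination of reward functions concrete, here is a minimal sketch. It is not the authors' implementation: the observation fields (`gold`, `depth`, `hp`), the three reward functions, and the helper `combined_reward` are all hypothetical names chosen for illustration. It only shows how a weight vector over a fixed set of predefined reward functions yields one scalar reward, and how a one-hot weight vector recovers the single-reward-per-option case.

```python
from typing import Callable, Dict, Sequence

# Hypothetical observation type: a dict of scalar game statistics.
Obs = Dict[str, float]
RewardFn = Callable[[Obs, Obs], float]


# Illustrative predefined reward functions (assumptions, not from the paper).
def gold_reward(prev: Obs, curr: Obs) -> float:
    return curr["gold"] - prev["gold"]


def depth_reward(prev: Obs, curr: Obs) -> float:
    return curr["depth"] - prev["depth"]


def health_reward(prev: Obs, curr: Obs) -> float:
    return curr["hp"] - prev["hp"]


REWARD_FNS: Sequence[RewardFn] = (gold_reward, depth_reward, health_reward)


def combined_reward(
    weights: Sequence[float],
    prev: Obs,
    curr: Obs,
    reward_fns: Sequence[RewardFn] = REWARD_FNS,
) -> float:
    """Return the weighted sum of the predefined rewards for one transition."""
    if len(weights) != len(reward_fns):
        raise ValueError("need exactly one weight per reward function")
    return sum(w * fn(prev, curr) for w, fn in zip(weights, reward_fns))


prev = {"gold": 10.0, "depth": 1.0, "hp": 14.0}
curr = {"gold": 15.0, "depth": 2.0, "hp": 12.0}

# A one-hot weight vector reduces to a single reward function per option.
single = combined_reward([1.0, 0.0, 0.0], prev, curr)

# A general weight vector selects a behaviour that no single function describes.
mixed = combined_reward([0.5, 1.0, -0.25], prev, curr)
```

In this sketch a higher-level policy would pick the weight vector and a lower-level policy would be trained on the resulting scalar reward. How HBS actually selects and conditions on those weights is not described in the summary above.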

Impact: Introduces a new hierarchical reinforcement learning method that may improve exploration strategies in complex environments.

Ranking rationale: This is an academic paper that details a new hierarchical reinforcement learning method.

AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

arXiv cs.LG TIER_1 English(EN) · Michael Tryfan Matthews, Anssi Kanervisto, Jakob Foerster, Pierluca D'Oro, Scott Fujimoto, Mikael Henaff · 2026-04-28 04:00

Hierarchical Behaviour Spaces

arXiv:2604.24558v1 Announce Type: cross Abstract: Recent work in hierarchical reinforcement learning has shown success in scaling to billions of timesteps when learning over a set of predefined option reward functions. We show that, instead of using a single reward function per o…
arXiv cs.AI TIER_1 English(EN) · Mikael Henaff · 2026-04-27 14:47

Hierarchical Behaviour Spaces

Recent work in hierarchical reinforcement learning has shown success in scaling to billions of timesteps when learning over a set of predefined option reward functions. We show that, instead of using a single reward function per option, the reward functions can be effectively use…