Finding the Time to Think: Learning Planning Budgets in Real-Time RL

New reinforcement learning method learns the optimal planning time for real-time decisions

By the PulseAugur editorial team · [1 source] · 2026-06-26 04:00

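Researchers have developed a new real-time reinforcement learning (RL) method for decision-making under time constraints. Their approach trains a lightweight gating policy that dynamically selects a state-dependent planning budget, letting the agent optimize how long it deliberates before acting. The technique was tested on several real-time games, including Pac-Man, Tetris, and Snake, where it outperformed fixed-budget and heuristic baselines.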
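To make the idea of a state-dependent planning budget concrete, here is a minimal sketch. It is not the paper's method: it swaps in a tabular epsilon-greedy bandit for the learned gating policy. The budgets, the state labels `calm` and `urgent`, and the `toy_return` function are all hypothetical and chosen only for illustration.

```python
import random

# Hypothetical planning budgets (e.g. number of search simulations per decision).
BUDGETS = [1, 4, 16]
# Hypothetical coarse state labels; a real gate would see environment features.
STATES = ["calm", "urgent"]


def toy_return(state, budget):
    """Hypothetical stand-in for an episode return.

    More planning yields a better action (with diminishing gains), but each
    unit of budget costs time, and that time is more expensive when the
    state is 'urgent'.
    """
    gain = {1: 1.0, 4: 2.0, 16: 3.0}[budget]
    time_cost = (0.02 if state == "calm" else 0.5) * budget
    return gain - time_cost


class TabularGate:
    """Epsilon-greedy gate that keeps one value estimate per (state, budget)."""

    def __init__(self, states, budgets, epsilon=0.1, seed=0):
        self.budgets = list(budgets)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.value = {s: [0.0] * len(self.budgets) for s in states}
        self.count = {s: [0] * len(self.budgets) for s in states}

    def greedy_index(self, state):
        values = self.value[state]
        return max(range(len(values)), key=values.__getitem__)

    def select(self, state):
        # Explore a random budget with probability epsilon, else exploit.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.budgets))
        return self.greedy_index(state)

    def update(self, state, index, ret):
        # Incremental sample mean of the returns observed for this pair.
        self.count[state][index] += 1
        n = self.count[state][index]
        self.value[state][index] += (ret - self.value[state][index]) / n

    def greedy_budget(self, state):
        return self.budgets[self.greedy_index(state)]


def train_gate(steps=2000, seed=0):
    gate = TabularGate(STATES, BUDGETS, seed=seed)
    for _ in range(steps):
        for state in STATES:
            index = gate.select(state)
            gate.update(state, index, toy_return(state, gate.budgets[index]))
    return gate


if __name__ == "__main__":
    gate = train_gate()
    for state in STATES:
        print(state, "->", gate.greedy_budget(state))
```

Under these toy assumptions the gate settles on the largest budget when time is cheap and the smallest when time is expensive, which is the behaviour a fixed budget cannot reproduce.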

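Impact: This research could lead to more efficient AI agents in time-sensitive applications, improving their performance in real-time environments.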

Ranking rationale: This cluster contains a research paper detailing a new reinforcement learning algorithm.

Read on arXiv cs.LG →

AI-generated summary · Google Gemini · from 1 source. How we write summaries →

Sources [1]

arXiv cs.LG · Aneesh Muppidi, Firas Darwish, Dylan Cope, João F. Henriques, Jakob Nicolaus Foerster · 2026-06-26 04:00

Finding the Time to Think: Learning Planning Budgets in Real-Time RL

arXiv:2606.26463v1 · Abstract: Deliberating takes time. In real-time settings, that time is not free. Standard reinforcement learning (RL) sidesteps this as the environment waits indefinitely for the agent's decision. Instead, we study real-time RL environments w…
