MiniGrid
PulseAugur coverage of MiniGrid — every cluster mentioning MiniGrid across labs, papers, and developer communities, ranked by signal.
1 day with sentiment data
-
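New CIG reward method enhances exploration in reinforcement learning
Researchers have introduced Conditional Information Gain (CIG), a novel reward mechanism for reinforcement learning designed to improve exploration strategies. CIG addresses the limitations of existing methods by providing a tractable alternative to trajectory-level information gain, which allows it to scale to high-dimensional state spaces. Tested on twelve tasks across discrete and continuous control environments, CIG showed competitive or superior performance compared with prior exploration techniques in the presence of stochastic distractors.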
-
New Gradient-Momentum Coupling metric enhances reinforcement learning progress measurement
Researchers have introduced Gradient-Momentum Coupling (GMC), a novel method for measuring learning progress in reinforcement learning. GMC quantifies the utility of a sample's gradient for ongoing learning by analyzing…
-
PACE method improves reinforcement learning generalization via parameter change evaluation
Researchers have introduced PACE, a novel approach to Unsupervised Environment Design (UED) for enhancing reinforcement learning generalization. PACE directly measures an environment's value by assessing the policy para…