D4RL
PulseAugur coverage of D4RL — every cluster mentioning D4RL across labs, papers, and developer communities, ranked by signal.
-
New ROMI method advances offline reinforcement learning, outperforming prior models
Researchers have introduced ROMI, a novel method for model-based offline reinforcement learning that addresses key challenges in adversarial model learning. Unlike previous approaches like RAMBO, which struggled with co…
-
New research enhances diffusion models for robust RL and safe planning
Researchers are developing new methods to improve the robustness and safety of diffusion models in reinforcement learning and planning tasks. One approach, Robust Regularized Policy Iteration (RRPI), addresses transitio…
-
New MPDiffuser framework enhances diffusion model control for robotics
Researchers have developed a new framework called Model Predictive Diffuser (MPDiffuser) to improve the reliability of diffusion models in offline decision-making tasks. This approach combines a diffusion planner with a…
-
New research explores Q-learning stability and offline RL methods
Two new research papers explore advancements in reinforcement learning techniques. One paper introduces Drift Q-Learning, a method that combines a drift-based behavioral regularizer with critic-driven policy improvement…
-
New MoMa QL framework boosts RL efficiency with moment matching
Researchers have introduced Moment Matching Q-Learning (MoMa QL), a novel framework designed to address the inference latency issues in score-based and flow-based generative models used in reinforcement learning. MoMa Q…
-
New SPAR framework addresses conflict in offline policy improvement
Researchers have introduced Support-Preserving Action Rectification (SPAR), a novel framework designed to address the inherent conflict in offline policy improvement. SPAR reframes global learning as a local residual re…
-
New research advances policy optimization for robotics and LLMs
Researchers have introduced several new methods to enhance policy optimization in reinforcement learning, particularly for complex tasks involving robotics and large language models. MODIP aims to efficiently fine-tune …
-
New COOPO framework boosts reinforcement learning efficiency
Researchers have developed a new framework called COOPO (Cyclic Offline-Online Policy Optimization) to address limitations in offline and online reinforcement learning. This method repeatedly cycles between offline trai…
-
SlimDT paper proposes injecting RTG outside sequential modeling
Researchers have developed SlimDT, a modification of the Decision Transformer (DT) model for offline reinforcement learning. SlimDT removes the Return-to-Go (RTG) token from the autoregressive sequence, instead injectin…