reinforcement learning
PulseAugur coverage of reinforcement learning — every cluster mentioning reinforcement learning across labs, papers, and developer communities, ranked by signal.
- instance of Q-learning 95%
- instance of Soft Actor-Critic Reinforcement Learning for Robotic Manipulator with Hindsight Experience Replay 95%
- instance of Markov decision process 90%
- used by large language model 90%
- instance of Multi-agent reinforcement learning 90%
- developed Vision-language-action model 90%
- instance of Very Large Array 90%
- instance of AIME24 90%
- used by Soft Actor-Critic 90%
- used by autonomous driving 90%
- used by unmanned aerial vehicle 90%
- used by Education 90%
- 2026-05-18 (research milestone): A new paper proposes a reinforcement learning framework for modeling customer trajectories in retail.
- Qwen-AgentWorld trains language model as RL agent simulator
Researchers have introduced Qwen-AgentWorld, a novel approach that trains a language model to function as a world model for reinforcement learning (RL) agents. This model predicts the next environment state based on the…
- Fireworks AI hosts hackathon showcasing reinforcement learning applications
Fireworks AI, in collaboration with hud_evals and Y Combinator, hosted the inaugural RSI RL Environments hackathon. The event showcased the versatility of reinforcement learning, with participants applying it to diverse…
- AI alignment research tackles reward hacking with new techniques
Researchers are exploring methods to prevent AI models from exploiting reward functions, a phenomenon known as reward hacking. One approach involves using steering vectors to guide gradient routing, aiming to isolate un…
- New JAX framework accelerates RL for penetration testing
Researchers have developed NASimJax, a new JAX-based framework designed to accelerate reinforcement learning (RL) for penetration testing. This framework significantly enhances the speed of existing simulators, enabling…
- Normalizing flows prove capable for continuous control in RL
Researchers have demonstrated that normalizing flows (NFs) are capable models for continuous control tasks in reinforcement learning (RL). Contrary to the prevailing belief that NFs lack sufficient expressivity, this pa…
- Reinforcement learning enables autonomous microrobot navigation in blood capillaries
Researchers have developed a new simulation environment for blood capillaries that incorporates realistic hydrodynamic flow fields, red blood cell dynamics, and anatomical geometry. Using this simulation, they trained d…
- Robots learn to play for better precision assembly
Researchers have developed a new framework called Play2Perfect to improve the dexterity of multi-fingered robots for precise assembly tasks. This framework focuses on pretraining robots through diverse play-based manipu…
- New Heavy-Ball Q-Learning method promises faster reinforcement learning convergence
Researchers have introduced a novel Heavy-Ball Q-Learning method designed to enhance reinforcement learning algorithms. This new approach establishes convergence guarantees and identifies conditions under which it can t…
- New MPC approach integrates future information for optimal decision-making
Researchers have developed a method to integrate future information into Model Predictive Control (MPC) for solving Markov Decision Processes (MDPs). This approach allows MPC, which is typically used for constraint enfo…
- Uncertainty-aware RL enhances chemical language models for drug design
Researchers have developed novel methods to incorporate predictive uncertainty into reinforcement learning for chemical language models (CLMs). These approaches aim to improve the de novo design of molecules by guiding …
- Reinforcement learning enhances bearing health monitoring with adaptive sim-to-real alignment
Researchers have developed a novel approach for improving the accuracy of vibration-based bearing health monitoring, particularly in scenarios with limited fault data. Their method utilizes reinforcement learning to ada…
- New open-source simulator MuJoFil targets high-fidelity vision RL training
A new open-source simulator called MuJoFil has been developed, aiming to address limitations in existing tools like MuJoCo for high-fidelity vision reinforcement learning (RL) training. MuJoFil combines Nvidia's GPU-nat…
- New FORCE framework boosts VLA model RL fine-tuning efficiency
Researchers have developed FORCE, a novel three-stage framework designed to improve the efficiency and stability of Reinforcement Learning (RL) fine-tuning for Vision-Language-Action (VLA) models. This approach addresse…
- AI alignment research defines 'reward hacking' in reinforcement learning
This item discusses "reward hacking" in reinforcement learning and AI alignment. It asks what it means to hit a target only to find that the outcome was wrong, and links this to Goodhart's Law. Th…
- New reinforcement learning method optimizes underwater vehicle power budgets
Researchers have developed a new method for controlling underwater vehicles that prioritizes energy efficiency by treating power consumption as an explicit constraint. This approach uses a constrained Markov decision pr…
- New research simplifies optimal policies in Markov decision processes
Researchers have developed a new approach to understanding optimal policies in structured Markov decision processes. The study proposes boundary-based policy approximations that directly learn policy regions, contrastin…
- New SR-PPO method improves RL for language models with single rollout
Researchers have developed a new method called Single-Rollout Proximal Policy Optimization (SR-PPO) to address the challenges of estimating token-level advantages in reinforcement learning for language models. This appr…
- New Generalization Spectrum framework evaluates AI learning transfer
Researchers have introduced the Generalization Spectrum, a novel evaluation framework designed to assess how far a learning algorithm's knowledge can transfer beyond its training data. This approach moves beyond traditi…
- New RL framework PEARL enhances LLM calendar conflict resolution
Researchers have developed PEARL, a reinforcement learning framework designed to improve the ability of large language models (LLMs) to manage calendar conflicts. Current LLM agents struggle with this task, exhibiting h…
- MyoInteract framework speeds up HCI biomechanics research with RL
Researchers have developed MyoInteract, a new framework designed to accelerate the prototyping of biomechanical human-computer interaction (HCI) tasks using reinforcement learning (RL). This framework significantly redu…