ENTITY reinforcement learning

PulseAugur coverage of reinforcement learning — every cluster mentioning reinforcement learning across labs, papers, and developer communities, ranked by signal.

Total · 30d 401 · 401 over 90d
Releases · 30d · 0 over 90d
Papers · 30d 380 · 380 over 90d

TIER MIX · 90D

significant 2
research 164
tool 224
commentary 11

TOPICS

paper 380
other 165
model release 121
safety 63
product 62
infra 42
opinion 2
funding 2

RELATIONSHIPS

instance of Q-learning 95%
instance of Soft Actor-Critic Reinforcement Learning for Robotic Manipulator with Hindsight Experience Replay 95%
instance of Markov decision process 90%
used by large language model 90%
instance of Multi-agent reinforcement learning 90%
developed Vision-language-action model 90%
instance of Very Large Array 90%
instance of AIME24 90%
used by Soft Actor-Critic 90%
used by autonomous driving 90%
used by unmanned aerial vehicle 90%
used by Education 90%

TIMELINE

2026-05-18 research_milestone A new paper proposes a reinforcement learning framework for modeling customer trajectories in retail.

SENTIMENT · 30D

26 days with sentiment data

RECENT · PAGE 1/10 · 200 TOTAL

TOOL · CL_114506 · Jun 28 · 11:20

Qwen-AgentWorld trains language model as RL agent simulator

Researchers have introduced Qwen-AgentWorld, a novel approach that trains a language model to function as a world model for reinforcement learning (RL) agents. This model predicts the next environment state based on the…

TOOL · CL_112584 · Jun 26 · 16:29

Fireworks AI hosts hackathon showcasing reinforcement learning applications

Fireworks AI, in collaboration with hud_evals and Y Combinator, hosted the inaugural RSI RL Environments hackathon. The event showcased the versatility of reinforcement learning, with participants applying it to diverse…

RESEARCH · CL_112642 · Jun 26 · 15:34

AI alignment research tackles reward hacking with new techniques

Researchers are exploring methods to prevent AI models from exploiting reward functions, a phenomenon known as reward hacking. One approach involves using steering vectors to guide gradient routing, aiming to isolate un…

TOOL · CL_111780 · Jun 26 · 04:00

New JAX framework accelerates RL for penetration testing

Researchers have developed NASimJax, a new JAX-based framework designed to accelerate reinforcement learning (RL) for penetration testing. This framework significantly enhances the speed of existing simulators, enabling…

TOOL · CL_111774 · Jun 26 · 04:00

Normalizing flows prove capable for continuous control in RL

Researchers have demonstrated that normalizing flows (NFs) are capable models for continuous control tasks in reinforcement learning (RL). Contrary to the prevailing belief that NFs lack sufficient expressivity, this pa…

TOOL · CL_111763 · Jun 26 · 04:00

Reinforcement learning enables autonomous microrobot navigation in blood capillaries

Researchers have developed a new simulation environment for blood capillaries that incorporates realistic hydrodynamic flow fields, red blood cell dynamics, and anatomical geometry. Using this simulation, they trained d…

TOOL · CL_111689 · Jun 26 · 04:00

Robots learn to play for better precision assembly

Researchers have developed a new framework called Play2Perfect to improve the dexterity of multi-fingered robots for precise assembly tasks. This framework focuses on pretraining robots through diverse play-based manipu…

RESEARCH · CL_111228 · Jun 25 · 14:48

New Heavy-Ball Q-Learning method promises faster reinforcement learning convergence

Researchers have introduced a novel Heavy-Ball Q-Learning method designed to enhance reinforcement learning algorithms. This new approach establishes convergence guarantees and identifies conditions under which it can t…

TOOL · CL_109972 · Jun 25 · 04:00

New MPC approach integrates future information for optimal decision-making

Researchers have developed a method to integrate future information into Model Predictive Control (MPC) for solving Markov Decision Processes (MDPs). This approach allows MPC, which is typically used for constraint enfo…

TOOL · CL_109939 · Jun 25 · 04:00

Uncertainty-aware RL enhances chemical language models for drug design

Researchers have developed novel methods to incorporate predictive uncertainty into reinforcement learning for chemical language models (CLMs). These approaches aim to improve the de novo design of molecules by guiding …

TOOL · CL_109894 · Jun 25 · 04:00

Reinforcement learning enhances bearing health monitoring with adaptive sim-to-real alignment

Researchers have developed a novel approach for improving the accuracy of vibration-based bearing health monitoring, particularly in scenarios with limited fault data. Their method utilizes reinforcement learning to ada…

TOOL · CL_109001 · Jun 24 · 19:07

New open-source simulator MuJoFil targets high-fidelity vision RL training

A new open-source simulator called MuJoFil has been developed, aiming to address limitations in existing tools like MuJoCo for high-fidelity vision reinforcement learning (RL) training. MuJoFil combines Nvidia's GPU-nat…

TOOL · CL_109508 · Jun 24 · 16:23

New FORCE framework boosts VLA model RL fine-tuning efficiency

Researchers have developed FORCE, a novel three-stage framework designed to improve the efficiency and stability of Reinforcement Learning (RL) fine-tuning for Vision-Language-Action (VLA) models. This approach addresse…

TOOL · CL_108613 · Jun 24 · 11:16

AI alignment research defines 'reward hacking' in reinforcement learning

This item discusses the concept of "reward hacking" within reinforcement learning and AI alignment. It poses a question about achieving a target only to find the outcome was incorrect, linking this to Goodhart's Law. Th…

TOOL · CL_109534 · Jun 24 · 10:48

New reinforcement learning method optimizes underwater vehicle power budgets

Researchers have developed a new method for controlling underwater vehicles that prioritizes energy efficiency by treating power consumption as an explicit constraint. This approach uses a constrained Markov decision pr…

RESEARCH · CL_109541 · Jun 24 · 09:00

New research simplifies optimal policies in Markov decision processes

Researchers have developed a new approach to understanding optimal policies in structured Markov decision processes. The study proposes boundary-based policy approximations that directly learn policy regions, contrastin…

RESEARCH · CL_109549 · Jun 24 · 06:26

New SR-PPO method improves RL for language models with single rollout

Researchers have developed a new method called Single-Rollout Proximal Policy Optimization (SR-PPO) to address the challenges of estimating token-level advantages in reinforcement learning for language models. This appr…

RESEARCH · CL_109570 · Jun 24 · 06:26

New Generalization Spectrum framework evaluates AI learning transfer

Researchers have introduced the Generalization Spectrum, a novel evaluation framework designed to assess how far a learning algorithm's knowledge can transfer beyond its training data. This approach moves beyond traditi…

TOOL · CL_108064 · Jun 24 · 04:00

New RL framework PEARL enhances LLM calendar conflict resolution

Researchers have developed PEARL, a reinforcement learning framework designed to improve the ability of large language models (LLMs) to manage calendar conflicts. Current LLM agents struggle with this task, exhibiting h…

TOOL · CL_108045 · Jun 24 · 04:00

MyoInteract framework speeds up HCI biomechanics research with RL

Researchers have developed MyoInteract, a new framework designed to accelerate the prototyping of biomechanical human-computer interaction (HCI) tasks using reinforcement learning (RL). This framework significantly redu…
