ENTITY reinforcement learning

PulseAugur coverage of reinforcement learning — every cluster mentioning reinforcement learning across labs, papers, and developer communities, ranked by signal.

Total · 30d: 215 (215 over 90d)
Releases · 30d: 0 over 90d
Papers · 30d: 204 (204 over 90d)

TIER MIX · 90D

significant 2
research 81
tool 124
commentary 8

TOPICS

paper 204
other 112
model release 45
safety 39
product 31
infra 10
opinion 2
funding 2

RELATIONSHIPS

instance of Soft Actor-Critic Reinforcement Learning for Robotic Manipulator with Hindsight Experience Replay 95%
used by large-language models 90%
used by GRPO 90%
used by Markov decision process 90%
used by large language model 90%
used by Soft Actor-Critic 90%
developed by large-language models 70%
developed by GRPO 70%
used by robotics 70%
used by supervised fine-tuning 70%
used by Group Relative Policy Optimization 70%
employs Diffusion Models 70%

TIMELINE

2026-05-18 · research_milestone · A new paper proposes a reinforcement learning framework for modeling customer trajectories in retail.

SENTIMENT · 30D

26 days with sentiment data

RECENT · PAGE 4/10 · 200 TOTAL

TOOL · CL_62163 · May 29 · 08:36

Graph-GRPO enhances e-commerce search relevance with LLMs

Researchers have developed Graph-GRPO, a novel framework for improving e-commerce search relevance by leveraging large language models and reinforcement learning. This method constructs a dependency graph of reasoning s…
COMMENTARY · CL_56939 · May 28 · 11:22

AI explores adaptive control systems in everyday tech

This cluster discusses adaptive control systems within AI, posing a question about which everyday systems best adapt to change. It highlights robotics, reinforcement learning, electrical engineering, and feedback loops …
TOOL · CL_56837 · May 28 · 10:15

Apertus LLM team seeks AI research engineers in Switzerland

The Apertus LLM team is seeking AI research engineers to join their FOSS initiative in Lausanne, Switzerland. Ideal candidates will have experience in software, data, and ML engineering, with a specific interest in post…
COMMENTARY · CL_53375 · May 27 · 00:54

Fireworks AI details complex RL infrastructure for continuous model updates

Fireworks AI is detailing the engineering challenges and solutions involved in training large language models, particularly focusing on Reinforcement Learning (RL) from human feedback. They highlight that a product's re…
RESEARCH · CL_56333 · May 27 · 00:00

New method GUI-CIDER boosts GUI agent knowledge

Researchers have developed GUI-CIDER, a novel mid-training method designed to enhance the world knowledge of GUI agents built with multimodal large language models. This approach explicitly internalizes GUI operational …
TOOL · CL_53259 · May 26 · 23:05

Reinforcement learning math series explains core agent reasoning tools

Shawn Hymel's latest post in his Reinforcement Learning math series explains key concepts such as expected return, the state-value function v(s), and the action-value function q(s,a). These mathematical tools are fundamental f…
RESEARCH · CL_52122 · May 26 · 10:29

Reinforcement learning pioneer partners with Chinese firm on 'Robot Kindergarten'

Richard Sutton, a pioneer in reinforcement learning, has partnered with Chinese haptic technology company HeShan Technology to launch a "Robot Kindergarten" project. This initiative aims to train embodied AI agents thro…
TOOL · CL_51474 · May 26 · 04:00

New RL method improves transfer learning with Bellman alignment

Researchers have introduced a new method called One-Step Bellman Alignment (RWT) to improve transfer learning in online reinforcement learning. This technique addresses the challenge of using data from related source ta…
TOOL · CL_51430 · May 26 · 04:00

Soft synthetic snakes learn to navigate complex 3D terrains

Researchers have developed a computational framework enabling soft synthetic snakes to navigate complex 3D terrains. The system uses bio-inspired actuation and sensing models to simplify control for these high-degree-of…
TOOL · CL_51375 · May 26 · 04:00

New RL framework optimizes laser manufacturing scan orders

Researchers have developed a new framework to improve reinforcement learning for optimizing scan orders in laser additive manufacturing. This bilevel Proxy-FEA diagnostic approach uses lightweight proxies for rapid can…
TOOL · CL_51101 · May 26 · 04:00

ResDreamer model enhances RL agents with hierarchical visual reasoning

Researchers have developed ResDreamer, a novel hierarchical world model designed to improve reinforcement learning in complex 3D environments. This self-supervised approach trains layers to reconstruct residuals of the …
TOOL · CL_51065 · May 26 · 04:00

New ERPD method enhances LLM reinforcement learning

Researchers have developed Extreme Region Policy Distillation (ERPD), a novel two-stage framework for reinforcement learning in large language models. This method aims to overcome the trade-off between sample efficiency…
TOOL · CL_50989 · May 26 · 04:00

New CEDGE framework uses diffusion models for off-dynamics reinforcement learning

Researchers have developed CEDGE, a novel framework for off-dynamics reinforcement learning that utilizes diffusion models to generate synthetic trajectories. This approach trains a diffusion model on source-domain data…
RESEARCH · CL_50951 · May 26 · 04:00

AI Research Advances Policy Optimization for LLMs and Robotics

Researchers are developing new methods to improve policy optimization in reinforcement learning, particularly for large language models and robotics. Techniques like Physics-Guided Policy Optimization (PGPO) and Hint-Gu…
TOOL · CL_50843 · May 26 · 04:00

Reinforcement learning optimizes EV charging for lower emissions

Researchers have developed a new emission-aware reinforcement learning strategy to optimize electric vehicle charging. This approach, based on the Soft Actor-Critic algorithm, prioritizes reducing carbon emissions and m…
TOOL · CL_50830 · May 26 · 04:00

AI research paper advocates for enactive perception and embodied interaction

This paper proposes integrating enactive approaches into artificial intelligence, viewing perception as an active, embodied engagement with the environment rather than passive input processing. It highlights four key en…
TOOL · CL_50826 · May 26 · 04:00

Neuro-inspired Inverter framework enhances AI planning and control

Researchers have developed a novel neuro-inspired framework called Inverter for embodied planning and control. This framework utilizes Inverse Learning (IL) to train components, bridging the gap between reinforcement le…
TOOL · CL_50800 · May 26 · 04:00

Quantum Frog game shows cooperation improves agent success

Researchers have developed a new cooperative game called Quantum Frog, inspired by Frogger, which uses a quantized-time mechanic where the environment only advances when a player acts. Using reinforcement learning, they…
COMMENTARY · CL_50745 · May 26 · 03:05

AI Safety expert critiques Bengio's 'Scientist AI' plan

A critique of Yoshua Bengio's "Scientist AI" proposal raises concerns about its alignment failures and practical feasibility. The author argues that preventing the AI from exploring agentically, a key aspect of scientif…
COMMENTARY · CL_48044 · May 25 · 03:01

Fireworks AI: Frontier RL infrastructure costs are lower than believed

Fireworks AI argues that the conventional wisdom regarding the cost of frontier Reinforcement Learning (RL) infrastructure is flawed. They propose that instead of transferring entire multi-terabyte model checkpoints for…
