ENTITY reinforcement learning

PulseAugur coverage of reinforcement learning — every cluster mentioning reinforcement learning across labs, papers, and developer communities, ranked by signal.

Total · 30d: 234 (234 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 223 (223 over 90d)

TIER MIX · 90D

significant 2
research 87
tool 137
commentary 8

TOPICS

paper 223
other 118
model release 50
safety 42
product 37
infra 13
opinion 2
funding 2

RELATIONSHIPS

instance of SOFT ACTOR-CRITIC REINFORCEMENT LEARNING FOR ROBOTIC MANIPULATOR WITH HINDSIGHT EXPERIENCE REPLAY 95%
used by large-language models 90%
used by GRPO 90%
used by Markov decision process 90%
instance of Multi-agent reinforcement learning 90%
instance of Very Large Array 90%
used by large language model 90%
used by Soft Actor-Critic 90%
developed by large-language models 70%
developed by GRPO 70%
used by robotics 70%
used by supervised fine-tuning 70%

TIMELINE

2026-05-18 · research milestone · A new paper proposes a reinforcement learning framework for modeling customer trajectories in retail.

SENTIMENT · 30D

25 days with sentiment data

RECENT · PAGE 10/10 · 200 TOTAL

RESEARCH · CL_20444 · May 5 · 04:00

ELVIS: Ensemble-Calibrated Latent Imagination for Long-Horizon Visual MPC

Researchers have developed ELVIS, a novel approach to long-horizon visual planning in reinforcement learning that uses a Gaussian-mixture model predictive controller to maintain multiple hypotheses over extended rollout…

TOOL · CL_16235 · May 5 · 04:00

RAST-MoE-RL framework enhances ride-hailing efficiency with specialized AI experts

Researchers have developed a new framework called RAST-MoE-RL to improve efficiency in ride-hailing services. This framework utilizes a Mixture-of-Experts (MoE) approach within deep reinforcement learning to better hand…

TOOL · CL_16176 · May 5 · 04:00

AI research integrates reward shaping with control functions for safer UAV navigation

Researchers have developed a novel approach for Unmanned Aerial Vehicle (UAV) navigation that combines reinforcement learning with control Lyapunov and barrier functions. This method aims to improve both mission efficie…

TOOL · CL_16021 · May 5 · 04:00

Infoprop Dyna enables Mini Wheelbot to learn racing in 11 minutes

Researchers have developed a new reinforcement learning framework called Infoprop Dyna that allows robots to learn complex tasks directly from real-world interactions, bypassing the need for traditional physics-based si…

TOOL · CL_15707 · May 5 · 04:00

Researchers use RL to improve MLLM regression on imbalanced data

Researchers have developed a new framework to improve how multimodal large language models (MLLMs) handle numerical regression tasks, particularly those with imbalanced data distributions. Existing training methods ofte…

TOOL · CL_15642 · May 5 · 04:00

New Omni-Fake dataset benchmarks multimodal deepfake detection on social media

Researchers have introduced Omni-Fake, a new benchmark dataset designed to improve the detection of multimodal deepfakes on social media. The dataset includes over 1 million samples across image, audio, video, and audio…

RESEARCH · CL_14492 · May 4 · 04:00

New LEGIT dataset evaluates LLM legal reasoning with issue tree rubrics

Researchers have developed LEGIT, a new dataset containing 24,000 legal reasoning instances designed to evaluate the quality of LLM-generated legal arguments. This dataset converts court judgments into hierarchical tree…

RESEARCH · CL_14455 · May 4 · 04:00

AI framework optimizes resource-constrained outbreak control using hierarchical reinforcement learning

Researchers have developed a hierarchical reinforcement learning framework to optimize the allocation of limited resources for controlling infectious disease outbreaks across multiple clusters. This approach uses a glob…

TOOL · CL_24789 · May 3 · 09:00

UAV navigation enhanced with RL, safety functions

Researchers have developed a novel approach for autonomous UAV navigation that enhances both speed and safety. This method combines reinforcement learning with potential-based reward shaping, control Lyapunov functions,…

RESEARCH · CL_14204 · May 1 · 07:54

New research advances bandit algorithms for control, causality, and multi-objective learning

Multiple research papers explore advancements in bandit algorithms across various domains. One study introduces a machine learning framework for optimal control of fluid restless multi-armed bandit problems, achieving s…

RESEARCH · CL_11893 · May 1 · 04:00

AutoREC platform uses RL agents to generate circuit models from EIS data

Researchers have developed AutoREC, an open-source Python package designed to automate the generation of equivalent circuit models (ECMs) from electrochemical impedance spectroscopy (EIS) data. This platform utilizes re…

RESEARCH · CL_11773 · May 1 · 04:00

Transformer RL optimizes 6G network function chain partitioning

Researchers have developed a new Transformer-based actor-critic reinforcement learning framework to address the challenges of partitioning Service Function Chains (SFCs) in future 6G networks. This approach utilizes sel…

RESEARCH · CL_11758 · May 1 · 04:00

OpAgent achieves 71.6% success rate in web navigation tasks

Researchers have developed OpAgent, a novel web navigation agent that utilizes online reinforcement learning to overcome the limitations of static datasets. The agent employs a hierarchical multi-task fine-tuning approa…

RESEARCH · CL_11711 · May 1 · 04:00

AI game teaches cybersecurity defense through interactive Q&A

Researchers have developed a novel educational framework called the Explainable Q20 Cybersecurity Recommender (EQ-20CR) that uses a game-inspired approach to teach cybersecurity. The system employs a reinforcement learn…

RESEARCH · CL_11520 · Apr 30 · 16:48

FiLMMeD model uses Feature-wise Linear Modulation for multi-depot vehicle routing

Researchers have introduced FiLMMeD, a novel neural network model designed to tackle various multi-depot vehicle routing problems (MDVRP). This model enhances generalization by incorporating Feature-wise Linear Modulati…

RESEARCH · CL_11524 · Apr 30 · 15:48

New paper derives exponential family results from single KL identity

Researchers have identified a fundamental identity for exponential families, which are distributions crucial to modern machine learning techniques like softmax and Gaussian distributions. This identity simplifies the de…

RESEARCH · CL_11403 · Apr 30 · 15:27

New Kernelized Advantage Estimation improves LLM reasoning with nonparametric statistics

Researchers have introduced Kernelized Advantage Estimation (KAE) to enhance the reasoning capabilities of large language models (LLMs) through reinforcement learning. KAE addresses limitations in existing methods like …

RESEARCH · CL_11380 · Apr 30 · 04:00

Surveys explore robot learning from human videos and world models, while new networks tackle driver monitoring

Two new survey papers explore advancements in robot learning, focusing on different data acquisition and utilization strategies. One paper provides a comprehensive review of world models, which are predictive representa…

RESEARCH · CL_10233 · Apr 30 · 04:00

DORA system accelerates LLM reinforcement learning by 2-4x with novel asynchronous rollout

Researchers have developed DORA, a novel asynchronous reinforcement learning system designed to accelerate language model training. DORA addresses the bottleneck caused by long-tailed trajectories in the rollout phase b…

RESEARCH · CL_09868 · Apr 29 · 16:01

New UPSi filter enhances safety in reinforcement learning with uncertainty quantification

Researchers have developed the Uncertainty-Aware Predictive Safety Filter (UPSi), a novel approach to enhance safety during reinforcement learning exploration. UPSi integrates probabilistic ensemble neural networks with…
