ENTITY reinforcement learning

reinforcement learning

PulseAugur coverage of reinforcement learning — every cluster mentioning reinforcement learning across labs, papers, and developer communities, ranked by signal.

Total · 30d

119

119 over 90d

Releases · 30d

0 over 90d

Papers · 30d

115

115 over 90d

TIER MIX · 90D

significant 2
research 44
tool 71
commentary 1
meme 1

SENTIMENT · 30D

7 days with sentiment data

RECENT · PAGE 3/5 · 94 TOTAL

TOOL · CL_15707 · May 5 · 04:00

Researchers use RL to improve MLLM regression on imbalanced data

Researchers have developed a new framework to improve how multimodal large language models (MLLMs) handle numerical regression tasks, particularly those with imbalanced data distributions. Existing training methods ofte…

TOOL · CL_15642 · May 5 · 04:00

New Omni-Fake dataset benchmarks multimodal deepfake detection on social media

Researchers have introduced Omni-Fake, a new benchmark dataset designed to improve the detection of multimodal deepfakes on social media. The dataset includes over 1 million samples across image, audio, video, and audio…

TOOL · CL_16021 · May 5 · 04:00

Infoprop Dyna enables Mini Wheelbot to learn racing in 11 minutes

Researchers have developed a new reinforcement learning framework called Infoprop Dyna that allows robots to learn complex tasks directly from real-world interactions, bypassing the need for traditional physics-based si…

TOOL · CL_16176 · May 5 · 04:00

AI research integrates reward shaping with control functions for safer UAV navigation

Researchers have developed a novel approach for Unmanned Aerial Vehicle (UAV) navigation that combines reinforcement learning with control Lyapunov and barrier functions. This method aims to improve both mission efficie…

TOOL · CL_16235 · May 5 · 04:00

RAST-MoE-RL framework enhances ride-hailing efficiency with specialized AI experts

Researchers have developed a new framework called RAST-MoE-RL to improve efficiency in ride-hailing services. This framework utilizes a Mixture-of-Experts (MoE) approach within deep reinforcement learning to better hand…

RESEARCH · CL_20444 · May 5 · 04:00

ELVIS: Ensemble-Calibrated Latent Imagination for Long-Horizon Visual MPC

Researchers have developed ELVIS, a novel approach to long-horizon visual planning in reinforcement learning that uses a Gaussian-mixture model predictive controller to maintain multiple hypotheses over extended rollout…

RESEARCH · CL_18799 · May 5 · 03:36

New DGPO framework improves LLM reasoning credit assignment

Researchers have introduced Distribution Guided Policy Optimization (DGPO), a new reinforcement learning framework designed to improve how large language models handle complex reasoning tasks. Current methods struggle w…

RESEARCH · CL_14455 · May 4 · 04:00

AI framework optimizes resource-constrained outbreak control using hierarchical reinforcement learning

Researchers have developed a hierarchical reinforcement learning framework to optimize the allocation of limited resources for controlling infectious disease outbreaks across multiple clusters. This approach uses a glob…

RESEARCH · CL_14492 · May 4 · 04:00

New LEGIT dataset evaluates LLM legal reasoning with issue tree rubrics

Researchers have developed LEGIT, a new dataset containing 24,000 legal reasoning instances designed to evaluate the quality of LLM-generated legal arguments. This dataset converts court judgments into hierarchical tree…

TOOL · CL_24789 · May 3 · 09:00

UAV navigation enhanced with RL, safety functions

Researchers have developed a novel approach for autonomous UAV navigation that enhances both speed and safety. This method combines reinforcement learning with potential-based reward shaping, control Lyapunov functions,…

RESEARCH · CL_14204 · May 1 · 13:20

New research advances bandit algorithms for control, causality, and multi-objective learning

Multiple research papers explore advancements in bandit algorithms across various domains. One study introduces a machine learning framework for optimal control of fluid restless multi-armed bandit problems, achieving s…

RESEARCH · CL_11711 · May 1 · 04:00

AI game teaches cybersecurity defense through interactive Q&A

Researchers have developed a novel educational framework called the Explainable Q20 Cybersecurity Recommender (EQ-20CR) that uses a game-inspired approach to teach cybersecurity. The system employs a reinforcement learn…

RESEARCH · CL_11893 · May 1 · 04:00

AutoREC platform uses RL agents to generate circuit models from EIS data

Researchers have developed AutoREC, an open-source Python package designed to automate the generation of equivalent circuit models (ECMs) from electrochemical impedance spectroscopy (EIS) data. This platform utilizes re…

RESEARCH · CL_11758 · May 1 · 04:00

OpAgent achieves 71.6% success rate in web navigation tasks

Researchers have developed OpAgent, a novel web navigation agent that utilizes online reinforcement learning to overcome the limitations of static datasets. The agent employs a hierarchical multi-task fine-tuning approa…

RESEARCH · CL_11773 · May 1 · 04:00

Transformer RL optimizes 6G network function chain partitioning

Researchers have developed a new Transformer-based actor-critic reinforcement learning framework to address the challenges of partitioning Service Function Chains (SFCs) in future 6G networks. This approach utilizes sel…

RESEARCH · CL_11520 · Apr 30 · 16:48

FiLMMeD model uses Feature-wise Linear Modulation for multi-depot vehicle routing

Researchers have introduced FiLMMeD, a novel neural network model designed to tackle various multi-depot vehicle routing problems (MDVRP). This model enhances generalization by incorporating Feature-wise Linear Modulati…

RESEARCH · CL_11524 · Apr 30 · 15:48

New paper derives exponential family results from single KL identity

Researchers have identified a fundamental identity for exponential families, which are distributions crucial to modern machine learning techniques like softmax and Gaussian distributions. This identity simplifies the de…

RESEARCH · CL_11403 · Apr 30 · 15:27

New Kernelized Advantage Estimation improves LLM reasoning with nonparametric statistics

Researchers have introduced Kernelized Advantage Estimation (KAE) to enhance the reasoning capabilities of large language models (LLMs) through reinforcement learning. KAE addresses limitations in existing methods like …

RESEARCH · CL_10233 · Apr 30 · 04:00

DORA system accelerates LLM reinforcement learning by 2-4x with novel asynchronous rollout

Researchers have developed DORA, a novel asynchronous reinforcement learning system designed to accelerate language model training. DORA addresses the bottleneck caused by long-tailed trajectories in the rollout phase b…

RESEARCH · CL_11380 · Apr 30 · 04:00

Surveys explore robot learning from human videos and world models, while new networks tackle driver monitoring

Two new survey papers explore advancements in robot learning, focusing on different data acquisition and utilization strategies. One paper provides a comprehensive review of world models, which are predictive representa…
