Markov decision process
PulseAugur coverage of Markov decision process — every cluster mentioning Markov decision process across labs, papers, and developer communities, ranked by signal.
-
New reinforcement learning method optimizes underwater vehicle power budgets
Researchers have developed a new method for controlling underwater vehicles that prioritizes energy efficiency by treating power consumption as an explicit constraint. This approach uses a constrained Markov decision pr…
-
New SciVerseGym environment standardizes AI-driven crystal discovery
Researchers have developed SciVerseGym, a new environment compatible with Gymnasium that frames crystal discovery as a Markov decision process. This platform allows agents to interact with atomistic structures, apply ed…
-
New POMDP framework optimizes lithium production under uncertainty
Researchers have developed a new framework using a partially observable Markov decision process (POMDP) to optimize lithium production decisions. This approach addresses uncertainties in geology, demand, and pricing, wh…
-
New Pareto Q-Learning algorithm enhances multi-objective reinforcement learning
Researchers have introduced Pareto Q-Learning with Reward Machines (PQLRM), a novel multi-objective reinforcement learning algorithm designed for tasks with complex reward structures defined by reward machines. This alg…
-
Researchers propose foundation models for reinforcement learning
A new research paper proposes developing foundation models specifically for reinforcement learning (RL), arguing that RL remains a conspicuous gap relative to language and vision. The authors suggest…
-
New Reversal Q-Learning algorithm boosts offline RL performance
Researchers have introduced Reversal Q-Learning (RQL), a novel off-policy reinforcement learning algorithm designed for offline RL tasks. RQL leverages iterative generative modeling techniques like flow matching to trai…
-
MagicSim infrastructure unifies robot learning, control, and planning
Researchers have introduced MagicSim, a new unified infrastructure designed for executable embodied interaction in robotics. This system aims to bridge the gap between robot control, skills, and planning by providing a …
-
New CADO framework improves heatmap-based solvers for combinatorial optimization
Researchers have introduced CADO, a novel framework designed to improve heatmap-based solvers for combinatorial optimization problems. Unlike traditional supervised learning methods that focus on imitating data structur…
-
New RL method uses K-step lookahead for faster learning
Researchers have developed a novel approach to reinforcement learning in non-episodic, finite-horizon Markov decision processes (MDPs). The method introduces a modified Q-function that limits planning to a K-step lookah…
-
New framework models digital twins with parsimonious stochastic surrogates
Researchers have developed a new framework for creating parsimonious stochastic surrogate models for digital twin applications. This method identifies essential variables from observational data by focusing on those tha…
-
New research finds Monte Carlo optimistic policy iteration can fail to converge with nonuniform updates
A new paper presents a certified counterexample to the convergence of Monte Carlo optimistic policy iteration when using nonuniform update frequencies. The research demonstrates that fixed nonuniform state-selection pro…
-
Reinforcement learning methods compared for optimizing business processes
A new research paper explores learning optimal policies for prescriptive process monitoring using reinforcement learning. The study compares a model-based approach using Markov decision processes (MDPs) with a model-fre…
-
New methods explore gradient-free optimization for neural networks
Researchers are exploring novel methods for optimizing neural networks without relying on traditional gradient-based approaches. One paper introduces a first-order layer for differentiable optimization that avoids compu…
-
New algorithm BLINQ learns Whittle indices for Markov decision processes
Researchers have developed BLINQ, a novel model-based algorithm designed to learn Whittle indices for Markov decision processes. This new approach constructs an empirical estimate of the MDP and then computes the indice…
-
Deep RL algorithms learn distinct representational invariances
Researchers have analyzed deep reinforcement learning representations using MDP reduction theory, finding that different algorithms learn distinct types of invariances. Specifically, DQN learns representations invariant…
-
LLMs use RL for adaptive sampling to cut costs
Researchers have developed a novel method for adaptive sampling in large language models (LLMs) that uses reinforcement learning (RL) to optimize performance. This approach formulates the sampling process as a Markov de…
-
New DRLHQ approach solves complex location-routing problems
Researchers have developed a novel end-to-end deep reinforcement learning approach called DRLHQ to tackle complex capacitated location-routing problems (CLRPs). This method, structured with an encoder-decoder framework,…
-
Review explores DRL for intelligent offloading in vehicular edge computing
This paper provides a comprehensive review of Deep Reinforcement Learning (DRL) approaches for intelligent offloading in vehicular edge computing (VEC). It categorizes existing research based on learning paradigms, syst…
-
New Q-value iteration analysis uses switching geometry
This paper introduces a new framework for analyzing Q-value iteration in Markov decision processes, focusing on a technique called rank-one deflation. The authors interpret the algorithm's behavior through the geometry …
-
New protocol optimizes drug trial subsidies to boost social utility
Researchers have developed a new statistical protocol for sequential experimentation that aims to optimize social utility in high-stakes domains like drug development. This protocol involves a product developer conducti…