ENTITY Markov decision processes: a tool for sequential decision making under uncertainty

PulseAugur coverage of Markov decision processes: a tool for sequential decision making under uncertainty. Every cluster mentioning this entity across labs, papers, and developer communities is listed here, ranked by signal.

Total · 30d: 30 over 90d

Releases · 30d: 0 over 90d

Papers · 30d: 29 over 90d

TIER MIX · 90D

research 14
tool 15
commentary 1

TOPICS

paper 29
other 24
safety 3
model release 3
product 1

SENTIMENT · 30D

12 day(s) with sentiment data

RECENT · PAGE 1/2 · 30 TOTAL

TOOL · CL_111775 · Jun 26 · 04:00

AI policies learn cybersecurity penetration testing faster with history aggregation

Researchers have developed and evaluated reinforcement learning policies for penetration testing in cybersecurity scenarios with partial observability. They compared several Proximal Policy Optimization (PPO) variants, …
TOOL · CL_109972 · Jun 25 · 04:00

New MPC approach integrates future information for optimal decision-making

Researchers have developed a method to integrate future information into Model Predictive Control (MPC) for solving Markov Decision Processes (MDPs). This approach allows MPC, which is typically used for constraint enfo…
TOOL · CL_109526 · Jun 24 · 13:16

New confidence sequences improve online statistical model checking for MDPs

Researchers have developed new confidence sequences for online statistical model checking of Markov decision processes (MDPs). These sequences aim to provide more accurate and efficient guarantees when exact probabiliti…
RESEARCH · CL_109541 · Jun 24 · 09:00

New research simplifies optimal policies in Markov decision processes

Researchers have developed a new approach to understanding optimal policies in structured Markov decision processes. The study proposes boundary-based policy approximations that directly learn policy regions, contrastin…
RESEARCH · CL_109497 · Jun 23 · 21:02

New minimax PAC bounds for learning in exogenous contextual MDPs

Researchers have developed new minimax PAC bounds for learning in exogenous contextual Markov decision processes (MDPs). The study focuses on tabular discounted MDPs with exogenous, i.i.d. contexts that can influence re…
TOOL · CL_100098 · Jun 19 · 04:00

In-context learning may enable intrinsic curiosity in machine learning

A new research paper explores whether in-context learning (ICL) capabilities of large sequence models can support intrinsic curiosity in machine learning. The study investigates if an exploration policy can be trained t…
RESEARCH · CL_99557 · Jun 18 · 13:19

New OPE method tackles missing rewards in reinforcement learning

Researchers have developed a new method for off-policy evaluation (OPE) in reinforcement learning when rewards are missing not at random (MNAR). This approach addresses selection bias by using future states as shadow va…
RESEARCH · CL_99689 · Jun 18 · 11:30

New research explores robust optimization and reinforcement learning techniques · 6 sources tracked

Several new research papers explore advanced techniques in reinforcement learning and optimization, focusing on robustness and generative models. One paper introduces a stationary robust mean-field game framework to add…
TOOL · CL_104022 · Jun 17 · 18:11

In-Context Learning Explored for AI Intrinsic Curiosity

Researchers have explored whether in-context learning (ICL) capabilities of sequence models can support intrinsic curiosity in machine learning. While traditional methods for automated data selection, or "intrinsic curi…
RESEARCH · CL_98174 · Jun 17 · 14:00

AI model optimizes Type 2 Diabetes follow-up intervals, reducing costs

Researchers have developed a Contextual Markov Decision Process (CMDP) model to optimize follow-up intervals for Type 2 Diabetes (T2D) patients, moving beyond the American Diabetes Association's fixed guidelines. By ana…
TOOL · CL_96221 · Jun 17 · 04:00

New AI Framework Optimizes Decision-Making in Complex Environments

Researchers have developed a new method for creating performance-driven environment abstractions in large Markov decision processes. This approach focuses on optimizing decision quality by aggregating states and enforci…
TOOL · CL_93859 · Jun 16 · 04:00

New Q-Learning Algorithms Offer Fine-Grained Regret Bounds

Researchers have developed new algorithms for Q-learning that provide more precise regret bounds in episodic tabular Markov Decision Processes. These advancements address limitations in existing methods by offering fine…
COMMENTARY · CL_88317 · Jun 12 · 23:10

ReAct Pattern Enhances LLM Reasoning and Action Capabilities

The ReAct Pattern is a design pattern for Large Language Models (LLMs) that enhances their reasoning and action capabilities in complex environments. It enables LLMs to perceive, reason, and act, allowing them to learn …
RESEARCH · CL_90807 · Jun 12 · 04:19

Lyapunov Framework Enhances Learning in Weakly-Coupled MDPs

Researchers have developed a novel Lyapunov-based framework to analyze the sample complexity of learning in weakly-coupled Markov decision processes (WCMDPs) and Restless Bandits (RBs). This approach offers a more effic…
RESEARCH · CL_82419 · Jun 9 · 15:15

New framework simplifies DRL for complex, state-dependent actions

Researchers have introduced a new framework called Bellman-Taylor score decoding to address challenges in applying deep reinforcement learning to Markov decision processes with complex, state-dependent actions. This met…
TOOL · CL_65914 · Jun 2 · 04:00

New Tangle-Core Abstraction Improves Reinforcement Learning

Researchers have developed a new method for state abstraction in Markov Decision Processes called tangle-core abstraction. This approach uses graph tangles to create overlapping abstract states, which is particularly us…
TOOL · CL_65340 · Jun 2 · 04:00

AI research links optimal control to prospect-theory behavior

A new research paper explores how optimal control in Markov decision processes (MDPs) can inherently lead to prospect-theory-like behaviors, even without explicit utility curvature or probability weighting. The study id…
RESEARCH · CL_68129 · Jun 2 · 02:30

POMDP value functions characterized as semi-algebraic sets

Researchers have characterized the feasible set of value functions in partially observable Markov decision processes (POMDPs) as a semi-algebraic set. This extends previous work on fully observable processes, revealing …
TOOL · CL_62645 · Jun 1 · 04:00

New framework offers optimal sequential testing for Markovian data

Researchers have developed a new framework for sequential hypothesis testing specifically designed for data generated by Markov chains. This framework establishes a non-asymptotic lower bound on the expected stopping ti…
TOOL · CL_58791 · May 29 · 04:00

Semantic Segmentation Enhances RL Agents in 3D ViZDoom Environments

Researchers have developed new input representations for reinforcement learning agents operating in 3D environments, specifically within the ViZDoom game. By employing semantic segmentation on RGB images, the proposed m…
