PulseAugur

Markov decision process

PulseAugur coverage of Markov decision process — every cluster mentioning Markov decision process across labs, papers, and developer communities, ranked by signal.

Total · 30d: 22 (22 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 22 (22 over 90d)
TIER MIX · 90D
SENTIMENT · 30D (1 day with sentiment data)

RECENT · PAGE 1/1 · 15 TOTAL
  1. TOOL · CL_28280 ·

    New Q-value iteration analysis uses switching geometry

    This paper introduces a new framework for analyzing Q-value iteration in Markov decision processes, focusing on a technique called rank-one deflation. The authors interpret the algorithm's behavior through the geometry …
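For context, the object under analysis here, classical Q-value iteration, can be sketched on a toy problem. The two-state MDP below is an illustrative assumption, not taken from the paper:

```python
# Classical Q-value iteration on a toy 2-state, 2-action MDP.
# The MDP itself (P, R, gamma) is an illustrative assumption.
import numpy as np

n_states, n_actions, gamma = 2, 2, 0.9
# P[s, a, s'] = transition probability; R[s, a] = expected reward
P = np.array([[[0.8, 0.2], [0.1, 0.9]],
              [[0.5, 0.5], [0.9, 0.1]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])

Q = np.zeros((n_states, n_actions))
for _ in range(500):
    # Bellman optimality update: Q(s,a) = R(s,a) + gamma * E[max_a' Q(s',a')]
    Q_new = R + gamma * P @ Q.max(axis=1)
    if np.abs(Q_new - Q).max() < 1e-10:
        Q = Q_new
        break
    Q = Q_new

greedy_policy = Q.argmax(axis=1)  # greedy policy w.r.t. the converged Q
```

Since the update is a gamma-contraction, the loop converges to the unique optimal Q regardless of initialization; the paper's geometric analysis concerns how that convergence unfolds.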

  2. TOOL · CL_22062 ·

    New protocol optimizes drug trial subsidies to boost social utility

    Researchers have developed a new statistical protocol for sequential experimentation that aims to optimize social utility in high-stakes domains like drug development. This protocol involves a product developer conducti…

  3. RESEARCH · CL_21752 ·

    Q-MMR framework offers novel approach to off-policy evaluation

    Researchers have introduced Q-MMR, a new theoretical framework for off-policy evaluation in Markov Decision Processes (MDPs). This method learns weights for data points to approximate expected returns under a target pol…
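Q-MMR's weight-learning machinery is not reproduced here; as a hedged point of comparison, the standard importance-sampling baseline it improves on, shown on an assumed one-step toy problem, looks like:

```python
# Ordinary importance-sampling off-policy evaluation (a standard baseline;
# Q-MMR's learned weights are not reproduced here). The bandit setup below
# is an illustrative assumption.
import numpy as np

rng = np.random.default_rng(0)
n_actions = 2
behavior = np.array([0.5, 0.5])     # behavior policy pi_b(a), logged the data
target = np.array([0.9, 0.1])       # target policy pi_e(a), to be evaluated
true_reward = np.array([0.0, 1.0])  # assumed E[r | a] for the toy problem

# Log data under the behavior policy (one-step MDP / bandit for brevity)
actions = rng.choice(n_actions, size=100_000, p=behavior)
rewards = true_reward[actions] + 0.1 * rng.standard_normal(actions.size)

# Each logged sample is reweighted by pi_e(a) / pi_b(a)
weights = target[actions] / behavior[actions]
v_hat = np.mean(weights * rewards)    # IS estimate of E_{pi_e}[r]
v_true = float(target @ true_reward)  # ground truth, known only in the toy
```

In multi-step MDPs these per-sample ratios multiply along trajectories and the estimator's variance explodes, which is the failure mode weight-learning methods like Q-MMR target.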

  4. TOOL · CL_18574 ·

    Reinforcement learning enhances autonomous target tracking accuracy and robustness

    Researchers have developed a deep reinforcement learning approach for autonomous bearings-only tracking of moving targets. The system formulates the observer maneuver problem as a belief Markov decision process, using a…

  5. TOOL · CL_18831 ·

    Reinforcement learning uses symmetry and data augmentation for faster aircraft control

    Researchers have developed a new method for offline reinforcement learning that leverages the symmetry of dynamical systems to improve sample efficiency. This approach uses symmetric data augmentation to enhance the sta…
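As an illustration of the general idea (not the paper's specific method), symmetric data augmentation for a system whose dynamics are invariant under negation might look like:

```python
# Symmetric data augmentation for offline RL: if the dynamics satisfy
# x' = f(x, u)  =>  -x' = f(-x, -u), every logged transition yields a
# mirrored one for free. The symmetry and dataset here are assumptions
# for illustration, not the paper's aircraft model.
import numpy as np

def augment_with_negation_symmetry(transitions):
    """transitions: list of (state, action, reward, next_state) tuples."""
    augmented = list(transitions)
    for s, a, r, s2 in transitions:
        augmented.append((-s, -a, r, -s2))  # mirrored transition
    return augmented

rng = np.random.default_rng(0)
data = [(rng.standard_normal(2), rng.standard_normal(1),
         float(rng.standard_normal()), rng.standard_normal(2))
        for _ in range(4)]
aug = augment_with_negation_symmetry(data)  # doubles the dataset
```

The payoff in the offline setting is sample efficiency: the learner sees transitions in regions of the state space the logged policy never visited, at no extra data-collection cost.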

  6. TOOL · CL_16024 ·

    New metric-normalized posterior leakage (mPL) enhances privacy for joint AI consumption

    Researchers have developed a new privacy metric called Metric-Normalized Posterior Leakage (mPL) to address limitations in existing differential privacy methods, particularly for machine learning systems used under join…

  7. RESEARCH · CL_16067 ·

    New research advances adversarial imitation learning theory and practice

    Two new papers explore the theoretical underpinnings of adversarial imitation learning (AIL), a technique that uses neural networks to learn from expert demonstrations. The first paper introduces OPT-AIL, a framework de…

  8. TOOL · CL_16235 ·

    RAST-MoE-RL framework enhances ride-hailing efficiency with specialized AI experts

    Researchers have developed a new framework called RAST-MoE-RL to improve efficiency in ride-hailing services. This framework utilizes a Mixture-of-Experts (MoE) approach within deep reinforcement learning to better hand…

  9. RESEARCH · CL_14397 ·

    Researchers find random data deletion improves adaptive RL policies

    Researchers have discovered that randomly deleting a portion of training data can significantly improve the performance of adaptive reinforcement learning policies. This counterintuitive technique helps by implicitly do…

  10. RESEARCH · CL_14217 ·

    DRL framework optimizes NR-U/Wi-Fi coexistence for fairness and throughput

    Researchers have developed a policy-driven deep reinforcement learning framework to manage resource allocation between NR-U and Wi-Fi networks operating in unlicensed spectrum. This framework uses a deep Q-network to le…

  11. RESEARCH · CL_11893 ·

    AutoREC platform uses RL agents to generate circuit models from EIS data

    Researchers have developed AutoREC, an open-source Python package designed to automate the generation of equivalent circuit models (ECMs) from electrochemical impedance spectroscopy (EIS) data. This platform utilizes re…

  12. COMMENTARY · CL_11269 ·

    Yann LeCun clarifies technical definition of 'world models' in AI

    Yann LeCun shared a technical discussion regarding the term "world models" in AI. He clarified that in control theory and the context of Markov Decision Processes (MDPs), "world models" specifically refers to transition…
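In the control-theory sense described, a world model is the transition kernel P(s' | s, a): a predictor of what the environment does next, usable for planning without acting. A minimal sketch (the toy chain below is an assumption for illustration, not from the discussion):

```python
# A "world model" in the MDP / control-theory sense: the transition kernel
# P(s' | s, a), here a lookup table for an assumed toy 3-state chain.
import numpy as np

n_states, n_actions = 3, 2
# world_model[s, a] is a probability distribution over next states s'
world_model = np.array([
    [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0]],  # from state 0
    [[0.0, 0.9, 0.1], [0.9, 0.1, 0.0]],  # from state 1
    [[0.0, 0.0, 1.0], [0.0, 0.9, 0.1]],  # from state 2
])

def predict_next_state_dist(s: int, a: int) -> np.ndarray:
    """What the model predicts the world will do after action a in state s."""
    return world_model[s, a]

def rollout(s: int, actions, rng) -> list:
    """Simulate a trajectory inside the model (planning without acting)."""
    states = [s]
    for a in actions:
        s = rng.choice(n_states, p=world_model[s, a])
        states.append(s)
    return states
```

The distinction being drawn is that this object predicts state evolution given actions, as opposed to looser uses of "world model" for any internal representation.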

  13. RESEARCH · CL_07013 ·

    AsyncShield: A Plug-and-Play Edge Adapter for Asynchronous Cloud-based VLA Navigation

    Researchers have developed AsyncShield, a new framework designed to improve the navigation capabilities of Vision-Language-Action (VLA) models on mobile robots. This system addresses the latency and network jitter issue…

  14. RESEARCH · CL_05164 ·

    New algorithm identifies near-optimal policies in robust constrained Markov decision processes

    Researchers have developed a novel algorithm to identify near-optimal policies in robust constrained Markov decision processes (RCMDPs). This new method addresses limitations in existing policy gradient approaches that …

  15. RESEARCH · CL_05085 ·

    Researchers develop MDP and POMDP for error mitigation in digital twins

    Researchers have developed a new framework for mitigating error propagation in modular digital twins by treating it as a sequential decision-making problem. They formulated this using a Markov Decision Process (MDP) and…