New Pareto Q-Learning algorithm enhances multi-objective reinforcement learning

By PulseAugur Editorial · [2 sources] · 2026-06-17 14:44

Researchers have introduced Pareto Q-Learning with Reward Machines (PQLRM), a multi-objective reinforcement learning algorithm for tasks whose reward structure is specified by a set of reward machines. It combines Pareto Q-Learning, which maintains sets of vector-valued Q-estimates to approximate the Pareto front, with techniques from Q-Learning with Reward Machines that exploit the automaton structure of the reward signal. PQLRM targets sample efficiency in non-Markovian environments whose rewards are encoded by reward machines. The authors report faster convergence and the ability to synthesize Pareto-optimal policies that other methods cannot.
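As a rough illustration of the two ingredients named in the summary, the Python sketch below shows Pareto-dominance filtering over vector-valued estimates and a toy reward machine whose reward depends on history. It is not the authors' implementation of PQLRM: the names `dominates`, `pareto_front`, and `RewardMachine`, the "coffee"/"office" labels, and the default of zero reward on unlisted transitions are all assumptions made for this example.

```python
from typing import Dict, Hashable, List, Tuple

Vector = Tuple[float, ...]


def dominates(a: Vector, b: Vector) -> bool:
    """True if a is at least as good as b on every objective and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(vectors: List[Vector]) -> List[Vector]:
    """Keep only the non-dominated vectors, dropping exact duplicates (order preserved)."""
    unique = list(dict.fromkeys(vectors))
    return [v for v in unique if not any(dominates(w, v) for w in unique)]


class RewardMachine:
    """Toy finite-state reward machine (hypothetical structure for illustration).

    transitions maps (rm_state, label) -> (next_rm_state, reward).
    Unlisted pairs are assumed to leave the state unchanged with zero reward.
    """

    def __init__(self, initial: Hashable,
                 transitions: Dict[Tuple[Hashable, str], Tuple[Hashable, float]]):
        self.initial = initial
        self.transitions = transitions

    def step(self, state: Hashable, label: str) -> Tuple[Hashable, float]:
        return self.transitions.get((state, label), (state, 0.0))


# Objective "fetch coffee, then reach the office": the reward for "office"
# depends on whether "coffee" was observed earlier, so it is non-Markovian
# in the environment state alone but Markovian once the RM state is tracked.
rm = RewardMachine(0, {(0, "coffee"): (1, 0.0), (1, "office"): (2, 1.0)})

# Candidate return vectors for one state-action pair, one entry per objective.
candidates = [(1.0, 0.0), (0.0, 1.0), (0.5, 0.5), (0.4, 0.4), (1.0, 0.0)]
front = pareto_front(candidates)
```

Here `front` keeps the three mutually non-dominated vectors and drops `(0.4, 0.4)`, which `(0.5, 0.5)` dominates. Calling `rm.step(0, "office")` yields no reward, while `rm.step(1, "office")` does, which is the history dependence that a reward machine makes explicit.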

IMPACT Enhances sample efficiency and policy synthesis in multi-objective reinforcement learning tasks with complex reward structures.

RANK_REASON The cluster contains a research paper submitted to arXiv detailing a new algorithm in reinforcement learning.

AI-generated summary · Google Gemini · from 2 sources.

COVERAGE [2]

arXiv cs.AI TIER_1 English(EN) · Arnaud Lequen, Clément Legrand-Lixon, Léo Saulières · 2026-06-18 04:00

Pareto Q-Learning with Reward Machines

arXiv:2606.19134v1 · Announce Type: cross · Abstract: We present Pareto Q-Learning with Reward Machines (PQLRM), a multi-objective reinforcement learning algorithm for tasks whose reward structure is specified by a set of reward machines (RMs). PQLRM combines Pareto Q-Learning (PQL),…

arXiv cs.AI TIER_1 English(EN) · Léo Saulières · 2026-06-17 14:44

Pareto Q-Learning with Reward Machines

We present Pareto Q-Learning with Reward Machines (PQLRM), a multi-objective reinforcement learning algorithm for tasks whose reward structure is specified by a set of reward machines (RMs). PQLRM combines Pareto Q-Learning (PQL), which maintains sets of vector-valued Q-estimates…
