
TOOL · arXiv cs.LG English(EN) · 23h

Provably Convergent Actor-Critic for MARL through Risk-aversion

Researchers have developed an actor-critic algorithm for multi-agent reinforcement learning (MARL) that addresses the difficulty of learning stationary policies in general-sum Markov games. The algorithm builds on Risk-averse Quantal response Equilibria (RQE), a solution concept that incorporates risk aversion and bounded rationality, and uses it to ensure convergence. The authors provide theoretical convergence guarantees and report experiments in which the method outperforms risk-neutral baselines.
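
To make the two ingredients of RQE concrete, here is a minimal sketch in a single-shot matrix game. It is not the paper's algorithm: it assumes the entropic risk measure as the form of risk aversion and a logit (softmax) response as the form of bounded rationality, and the function names and the parameters `tau` (risk aversion) and `eps` (response temperature) are hypothetical choices for illustration.

```python
import numpy as np

def entropic_risk(payoffs, probs, tau):
    """Certainty equivalent -(1/tau) * log E[exp(-tau * X)] (assumed risk measure)."""
    z = -tau * np.asarray(payoffs, dtype=float)
    m = z.max()  # shift by the max for numerical stability
    return -(m + np.log(np.sum(np.asarray(probs) * np.exp(z - m)))) / tau

def quantal_response(values, eps):
    """Logit response: softmax of action values with temperature eps."""
    z = np.asarray(values, dtype=float) / eps
    z = z - z.max()  # shift by the max for numerical stability
    p = np.exp(z)
    return p / p.sum()

def risk_averse_quantal_response(payoff_matrix, opponent_policy, tau, eps):
    """Score each own action by its risk-adjusted payoff, then respond noisily."""
    values = [entropic_risk(row, opponent_policy, tau) for row in payoff_matrix]
    return quantal_response(values, eps)

# Hypothetical 2x2 game: action 0 always pays 1, action 1 pays 0 or 2.
payoff_matrix = np.array([[1.0, 1.0],
                          [0.0, 2.0]])
opponent_policy = np.array([0.5, 0.5])
policy = risk_averse_quantal_response(payoff_matrix, opponent_policy, tau=1.0, eps=0.5)
```

Both actions have the same expected payoff against a uniform opponent, so a risk-neutral logit response would be indifferent between them; under the assumed entropic risk measure the agent instead shifts probability toward the safe action, while still playing both because of the softmax.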

IMPACT Introduces a theoretical framework and algorithm for improving convergence in multi-agent reinforcement learning, with potential relevance to complex coordination tasks.

arXiv
Yizhou Zhang