Brief · PulseAugur

TOOL · arXiv cs.LG

Near-Optimal Regret for Distributed Adversarial Bandits: A Black-Box Approach

Researchers have proposed a black-box reduction from distributed adversarial bandits to bandits with delayed feedback, in which agents communicate only through gossip. The resulting algorithm improves on the regret upper bounds of prior work, and a matching lower bound shows that the regret decomposes into a communication cost and a bandit cost. The same framework also yields bounds for distributed linear bandits with reduced communication overhead.
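To make the gossip ingredient concrete, here is a minimal Python sketch of plain gossip averaging. It is not the paper's algorithm: the ring topology, the uniform 1/3 mixing weights, and the names `ring_gossip_matrix` and `gossip_rounds` are assumptions chosen for illustration. It only shows that agents exchanging values with neighbours need several rounds before everyone holds the network-wide average, which is one plausible intuition for why delayed feedback appears in the reduction.

```python
import numpy as np


def ring_gossip_matrix(n):
    """Hypothetical mixing matrix for n agents on a ring.

    Each agent averages its own value with its two ring neighbours.
    The matrix is symmetric and doubly stochastic, so repeated mixing
    preserves the network-wide mean.
    """
    W = np.zeros((n, n))
    for i in range(n):
        W[i, i] += 1 / 3
        W[i, (i - 1) % n] += 1 / 3
        W[i, (i + 1) % n] += 1 / 3
    return W


def gossip_rounds(values, W, rounds):
    """Run the given number of gossip rounds: x <- W x."""
    x = np.asarray(values, dtype=float)
    for _ in range(rounds):
        x = W @ x
    return x


if __name__ == "__main__":
    n = 8
    local_values = np.arange(n)  # e.g. one locally observed statistic per agent
    W = ring_gossip_matrix(n)
    for r in (1, 5, 50):
        x = gossip_rounds(local_values, W, r)
        # The spread shrinks as rounds increase; the mean stays fixed.
        print(r, float(x.max() - x.min()), float(x.mean()))
```

The number of rounds needed for the spread to become small depends on the graph and the mixing weights, so a sparser network means agents act on older information.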

arXiv
cs.LG
Hao Qiu
Vojnovic