
New algorithms improve online learning with side-observation graphs

Researchers have developed new algorithms for adversarial multi-armed bandit problems in which the learner observes the losses of some arms besides the one it actually chose. In the setting studied, every non-chosen arm reveals its loss with a fixed but unknown probability $r$. The proposed methods achieve regret bounds that are close to optimal without knowing $r$ in advance.
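To make the feedback model concrete, here is a minimal Python sketch of an Exp3-style learner with probabilistic side observations. It is not the algorithm from the paper: it assumes the observation probability `r` is known, whereas the paper addresses the case where it is unknown. The function names, the learning rate `eta`, and the loss format are all hypothetical choices made for illustration. The sketch uses the standard importance-weighted estimate, dividing each observed loss by the probability that it would be observed, which is `p_i + (1 - p_i) * r` for an arm sampled with probability `p_i`.

```python
import math
import random


def observation_probability(p_i: float, r: float) -> float:
    """Chance that arm i's loss is seen: it is chosen (p_i) or, if not chosen, revealed (r)."""
    return p_i + (1.0 - p_i) * r


def sampling_probs(cum_est, eta):
    """Exponential-weights distribution over arms from cumulative loss estimates."""
    m = min(cum_est)  # subtract the minimum for numerical stability
    w = [math.exp(-eta * (c - m)) for c in cum_est]
    z = sum(w)
    return [x / z for x in w]


def run_exp3_side_obs(losses, r, eta, seed=0):
    """Illustrative learner; ASSUMES r is known, unlike the setting in the paper.

    losses: list of rounds, each a list of per-arm losses in [0, 1].
    Returns (total loss suffered, final sampling distribution).
    """
    rng = random.Random(seed)
    n_arms = len(losses[0])
    cum_est = [0.0] * n_arms
    total_loss = 0.0
    for round_losses in losses:
        probs = sampling_probs(cum_est, eta)
        chosen = rng.choices(range(n_arms), weights=probs)[0]
        total_loss += round_losses[chosen]
        for i in range(n_arms):
            # The chosen arm is always observed; every other arm with probability r.
            observed = (i == chosen) or (rng.random() < r)
            if observed:
                cum_est[i] += round_losses[i] / observation_probability(probs[i], r)
    return total_loss, sampling_probs(cum_est, eta)


if __name__ == "__main__":
    rounds = [[0.0, 1.0, 0.5] for _ in range(500)]
    loss, final_probs = run_exp3_side_obs(rounds, r=0.3, eta=0.05)
    print(loss, final_probs)
```

Setting `r = 0` recovers the plain bandit estimate (only the chosen arm is seen), and `r = 1` recovers full-information feedback, so the side-observation model interpolates between the two.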

Summary written by gemini-2.5-flash-lite from 2 sources.

IMPACT Introduces novel algorithms for bandit problems with partial feedback, potentially improving decision-making in online learning systems.

RANK_REASON Academic paper published on arXiv detailing new algorithms for a specific machine learning problem.


COVERAGE [2]

  1. arXiv stat.ML TIER_1 · Tomáš Kocák, Gergely Neu, Michal Valko

    Online learning with Erdős-Rényi side-observation graphs

    arXiv:2604.25271v1. Abstract: We consider adversarial multi-armed bandit problems where the learner is allowed to observe losses of a number of arms beside the arm that it actually chose. We study the case where all non-chosen arms reveal their loss with a fixed…

  2. arXiv stat.ML TIER_1 · Michal Valko

    Online learning with Erdős-Rényi side-observation graphs

    We consider adversarial multi-armed bandit problems where the learner is allowed to observe losses of a number of arms beside the arm that it actually chose. We study the case where all non-chosen arms reveal their loss with a fixed but unknown probability $r$, independently of e…