PulseAugur

New algorithms improve online learning with side-observation graphs

Researchers have developed new algorithms for adversarial multi-armed bandit problems in which the learner sees the losses of some arms besides the one it chose. The algorithms handle the case where each non-chosen arm reveals its loss with a fixed but unknown probability. The proposed methods achieve near-optimal regret bounds without knowing that observation probability in advance.
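
The feedback model described above can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the function names are hypothetical, and the loss estimator assumes the observation probability `r` is known, whereas the paper's setting treats it as unknown. The sketch only shows which losses a learner would see in one round and how a standard importance-weighted estimate could be formed from them.

```python
import random


def side_observation_round(losses, chosen, r, rng):
    """Return {arm: loss} for every arm whose loss is revealed this round.

    The chosen arm is always observed; each other arm is revealed
    independently with probability r.
    """
    observed = {chosen: losses[chosen]}
    for arm, loss in enumerate(losses):
        if arm != chosen and rng.random() < r:
            observed[arm] = loss
    return observed


def importance_weighted_estimates(observed, probs, r):
    """Loss estimates that are unbiased when r is known (an assumption here).

    Arm i is observed with probability probs[i] + (1 - probs[i]) * r:
    either it was chosen, or it was not chosen and was revealed anyway.
    """
    estimates = [0.0] * len(probs)
    for arm, loss in observed.items():
        observe_prob = probs[arm] + (1.0 - probs[arm]) * r
        estimates[arm] = loss / observe_prob
    return estimates


if __name__ == "__main__":
    rng = random.Random(0)
    losses = [0.2, 0.9, 0.5, 0.1]      # illustrative adversarial losses
    probs = [0.25, 0.25, 0.25, 0.25]   # learner's sampling distribution
    chosen = rng.choices(range(len(losses)), weights=probs)[0]
    seen = side_observation_round(losses, chosen, r=0.5, rng=rng)
    print(seen)
    print(importance_weighted_estimates(seen, probs, r=0.5))
```

With `r = 0` this reduces to ordinary bandit feedback, and with `r = 1` to full-information feedback; intermediate values interpolate between the two.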

IMPACT Introduces novel algorithms for bandit problems with partial feedback, potentially improving decision-making in online learning systems.

RANK_REASON Academic paper published on arXiv detailing new algorithms for a specific machine learning problem.


AI-generated summary · Google Gemini · from 2 sources.

COVERAGE [2]

  1. arXiv stat.ML TIER_1 English(EN) · Tomáš Kocák, Gergely Neu, Michal Valko ·

    Online learning with Erdős-Rényi side-observation graphs

    arXiv:2604.25271v1. Abstract: We consider adversarial multi-armed bandit problems where the learner is allowed to observe losses of a number of arms beside the arm that it actually chose. We study the case where all non-chosen arms reveal their loss with a fixed…

  2. arXiv stat.ML TIER_1 English(EN) · Michal Valko ·

    Online learning with Erdős-Rényi side-observation graphs

    We consider adversarial multi-armed bandit problems where the learner is allowed to observe losses of a number of arms beside the arm that it actually chose. We study the case where all non-chosen arms reveal their loss with a fixed but unknown probability $r$, independently of e…