PulseAugur

New Research Characterizes Delayed Feedback in Stochastic Linear Bandits

A new paper establishes near-optimal regret guarantees for stochastic linear bandits with delayed feedback. The study distinguishes between loss-independent and loss-dependent delays. Loss-independent delays incur only an additive regret penalty that does not depend on the dimension. Loss-dependent delays are harder: their penalty scales with the square root of the dimension, which makes the problem significantly harder than its multi-armed bandit counterpart.
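To make the two delay models concrete, here is a minimal Python sketch of a linear bandit environment in which each observation arrives some rounds after the action is played. It is an illustration under stated assumptions, not the paper's algorithm: the names `DelayedFeedbackBuffer`, `simulate`, `loss_independent_delay`, and `loss_dependent_delay` are hypothetical, the policy is a uniformly random placeholder, and the noise level and delay rules are arbitrary choices.

```python
import heapq
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


class DelayedFeedbackBuffer:
    """Holds observations until the round in which they arrive (hypothetical helper)."""

    def __init__(self):
        self._heap = []
        self._counter = 0  # tie-breaker so the heap never compares payloads

    def push(self, arrival_round, payload):
        heapq.heappush(self._heap, (arrival_round, self._counter, payload))
        self._counter += 1

    def pop_arrived(self, current_round):
        """Return every payload whose arrival round is <= current_round."""
        arrived = []
        while self._heap and self._heap[0][0] <= current_round:
            arrived.append(heapq.heappop(self._heap)[2])
        return arrived


def loss_independent_delay(loss, rng):
    # Assumption: the delay is drawn without looking at the loss.
    return rng.randint(0, 5)


def loss_dependent_delay(loss, rng):
    # Assumption: large losses are reported late, small losses immediately.
    return 0 if loss < 0.5 else 10


def simulate(num_rounds, theta, actions, delay_fn, seed=0):
    """Play a placeholder random policy and return feedback in arrival order."""
    rng = random.Random(seed)
    buffer = DelayedFeedbackBuffer()
    observed = []
    for t in range(num_rounds):
        action = rng.choice(actions)  # placeholder policy, not a learning rule
        loss = dot(theta, action) + rng.gauss(0.0, 0.1)  # linear loss plus noise
        buffer.push(t + delay_fn(loss, rng), (t, action, loss))
        observed.extend(buffer.pop_arrived(t))  # only arrived feedback is visible
    return observed
```

With `loss_dependent_delay`, the feedback visible at any round is skewed toward small losses, because large losses are still in the buffer. That selection effect is one intuition for why delays that depend on the loss are harder to handle than delays that do not.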



AI-generated summary · Google Gemini · from 2 sources.

COVERAGE [2]

  1. arXiv cs.LG · English (EN) · Ofir Schlisselberg, Mengxiao Zhang, Yishay Mansour

    Near-Optimal Stochastic Linear Bandits with Delay

    arXiv:2606.16656v1. Abstract: We study stochastic linear bandits with delayed feedback under several delay models and establish near-optimal regret guarantees. Our results identify when delayed linear bandits exhibit the same qualitative behavior as multi-armed …

  2. arXiv cs.LG · English (EN) · Yishay Mansour

    Near-Optimal Stochastic Linear Bandits with Delay

    We study stochastic linear bandits with delayed feedback under several delay models and establish near-optimal regret guarantees. Our results identify when delayed linear bandits exhibit the same qualitative behavior as multi-armed bandits (MAB), and when the linear structure cre…