Near-Optimal Stochastic Linear Bandits with Delay

New research characterizes delayed feedback in stochastic linear bandits

By the PulseAugur editorial team · [2 sources] · 2026-06-15 12:48

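Researchers have published a paper establishing near-optimal regret guarantees for stochastic linear bandits with delayed feedback. The work distinguishes loss-independent delays from loss-dependent delays and finds that the former incur only a dimension-free additive penalty. Loss-dependent delays, by contrast, pose a greater challenge: their penalty scales with the square root of the dimension, which makes them harder to handle than in the multi-armed bandit setting.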
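To make the setting concrete, the following is a minimal simulation sketch of a linear bandit whose rewards arrive late. It is not the paper's algorithm: it assumes a generic LinUCB-style learner, a fixed delay that does not depend on the reward (the loss-independent case), and a small hand-picked action set. The function name `run_delayed_linucb` and all parameters (`beta`, `noise`, `delay`) are hypothetical choices made for illustration.

```python
import heapq
import math
import random


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def mat_vec(m, v):
    return [dot(row, v) for row in m]


def run_delayed_linucb(actions, theta_star, horizon, delay,
                       beta=1.0, noise=0.1, seed=0):
    """Simulate a LinUCB-style learner under a fixed feedback delay.

    Illustrative assumption, not the paper's method. A reward generated in
    round t becomes usable at the start of round t + 1 + delay.
    Returns (pseudo_regret, number_of_rewards_received).
    """
    rng = random.Random(seed)
    d = len(theta_star)
    # Inverse of the regularised design matrix V = I + sum of a a^T.
    v_inv = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
    b = [0.0] * d   # sum of reward * action over the feedback seen so far
    pending = []    # min-heap of (arrival_round, round_played, action, reward)
    best = max(dot(theta_star, a) for a in actions)
    regret, received = 0.0, 0

    for t in range(horizon):
        # Incorporate only the feedback whose delay has elapsed.
        while pending and pending[0][0] <= t:
            _, _, a, r = heapq.heappop(pending)
            va = mat_vec(v_inv, a)
            denom = 1.0 + dot(a, va)
            # Sherman-Morrison rank-one update of V^{-1}.
            for i in range(d):
                for j in range(d):
                    v_inv[i][j] -= va[i] * va[j] / denom
            b = [bi + r * ai for bi, ai in zip(b, a)]
            received += 1

        theta_hat = mat_vec(v_inv, b)  # ridge estimate from received feedback

        def ucb(a):
            return dot(theta_hat, a) + beta * math.sqrt(dot(a, mat_vec(v_inv, a)))

        a_t = max(actions, key=ucb)  # ties go to the earliest action in the list
        reward = dot(theta_star, a_t) + rng.gauss(0.0, noise)
        heapq.heappush(pending, (t + 1 + delay, t, a_t, reward))
        regret += best - dot(theta_star, a_t)

    return regret, received


if __name__ == "__main__":
    arms = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
    theta = (0.2, 0.9)
    for delay in (0, 10, 50):
        print(delay, run_delayed_linucb(arms, theta, horizon=200, delay=delay))
```

Running the script with several values of `delay` shows how late feedback postpones learning in this toy model. A loss-dependent delay could be mimicked by making the arrival round a function of `reward`, but the sketch makes no attempt to reproduce the bounds described in the paper.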

Ranking rationale: This cluster contains a research paper posted on arXiv that presents a theoretical advance in machine learning.

AI-generated summary · Google Gemini · based on 2 sources.

Sources [2]

arXiv cs.LG TIER_1 · Ofir Schlisselberg, Mengxiao Zhang, Yishay Mansour · 2026-06-16 04:00

Near-Optimal Stochastic Linear Bandits with Delay

arXiv:2606.16656v1. Abstract: We study stochastic linear bandits with delayed feedback under several delay models and establish near-optimal regret guarantees. Our results identify when delayed linear bandits exhibit the same qualitative behavior as multi-armed …

arXiv cs.LG TIER_1 · Yishay Mansour · 2026-06-15 12:48

Near-Optimal Stochastic Linear Bandits with Delay

We study stochastic linear bandits with delayed feedback under several delay models and establish near-optimal regret guarantees. Our results identify when delayed linear bandits exhibit the same qualitative behavior as multi-armed bandits (MAB), and when the linear structure cre…