PulseAugur
Near-Optimal Stochastic Linear Bandits with Delay

New research characterizes delayed feedback in stochastic linear bandits

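Researchers have published a paper establishing near-optimal regret guarantees for stochastic linear bandits with delayed feedback. The work distinguishes loss-independent delays from loss-dependent delays and finds that the former incur only a dimension-free additive penalty. Loss-dependent delays are more challenging: their penalty scales with the square root of the dimension, which makes them harder to handle than in the multi-armed bandit setting.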
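To make the distinction between the two delay models concrete, here is a toy sketch. It is not the paper's algorithm and does not reproduce any of its bounds. It assumes a single fixed action with Bernoulli(0.5) losses, and the names `observed_mean`, `simulate`, `horizon`, and `max_delay` are hypothetical. It illustrates one commonly cited intuition: when the delay depends on the loss, the feedback that has arrived by a given round is no longer an unbiased sample of the losses.

```python
import random


def observed_mean(losses, delays, horizon):
    """Average of the losses whose feedback has arrived by round `horizon`.

    The loss incurred in round t (0-indexed) is revealed in round t + delays[t].
    """
    arrived = [
        loss
        for t, (loss, delay) in enumerate(zip(losses, delays))
        if t + delay <= horizon
    ]
    return sum(arrived) / len(arrived) if arrived else float("nan")


def simulate(horizon=2000, max_delay=500, seed=0):
    """Compare the two delay models on the same loss sequence (toy setup)."""
    rng = random.Random(seed)
    # Assumption: one fixed action with i.i.d. Bernoulli(0.5) losses.
    losses = [1.0 if rng.random() < 0.5 else 0.0 for _ in range(horizon)]
    # Loss-independent: the delay is drawn without looking at the loss.
    independent = [rng.randint(0, max_delay) for _ in range(horizon)]
    # Loss-dependent: a loss of 1 is held back for max_delay rounds,
    # while a loss of 0 is revealed immediately.
    dependent = [max_delay if loss == 1.0 else 0 for loss in losses]
    true_mean = sum(losses) / horizon
    return (
        true_mean,
        observed_mean(losses, independent, horizon),
        observed_mean(losses, dependent, horizon),
    )


if __name__ == "__main__":
    true_mean, indep, dep = simulate()
    print(f"true mean loss:             {true_mean:.3f}")
    print(f"seen, loss-independent:     {indep:.3f}")
    print(f"seen, loss-dependent:       {dep:.3f}")
```

In this sketch the loss-independent estimate stays close to the true mean because missing feedback is a random subsample, while the loss-dependent estimate is pulled downward because the high losses are exactly the ones still in transit.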

Ranking rationale: This cluster contains a research paper published on arXiv that details a theoretical advance in machine learning.


AI-generated summary · Google Gemini · from 2 sources.

Sources [2]

  1. arXiv cs.LG · Ofir Schlisselberg, Mengxiao Zhang, Yishay Mansour

    Near-Optimal Stochastic Linear Bandits with Delay

    arXiv:2606.16656v1. Abstract: We study stochastic linear bandits with delayed feedback under several delay models and establish near-optimal regret guarantees. Our results identify when delayed linear bandits exhibit the same qualitative behavior as multi-armed …

  2. arXiv cs.LG · Yishay Mansour

    Near-Optimal Stochastic Linear Bandits with Delay

    We study stochastic linear bandits with delayed feedback under several delay models and establish near-optimal regret guarantees. Our results identify when delayed linear bandits exhibit the same qualitative behavior as multi-armed bandits (MAB), and when the linear structure cre…