PulseAugur

New private Thompson Sampling algorithm improves contextual bandit performance

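Researchers have developed AdaPrivate-TS, a new differentially private contextual bandit algorithm that combines Thompson Sampling with batched zCDP composition. The method interprets the added Gaussian noise as increased uncertainty, which improves both performance and the privacy guarantee. Experiments on a variety of datasets show that AdaPrivate-TS retains a high fraction of non-private performance across privacy budgets and outperforms other baselines, especially when privacy amplification is applied.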
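The idea of treating privacy noise as extra posterior uncertainty can be illustrated with a minimal Python sketch. This is not the AdaPrivate-TS algorithm itself, whose details are not given here. It assumes a linear-Gaussian reward model, and all function and parameter names (`gaussian_zcdp_rho`, `zcdp_to_approx_dp`, `noisy_ts_posterior`, `choose_arm`, `sigma_dp`, `prior_prec`) are hypothetical. For brevity only the statistic `X^T r` is noised, so the sketch on its own does not give a complete privacy guarantee. The two accounting helpers use the standard zCDP facts for the Gaussian mechanism: one release costs rho = Delta^2 / (2 sigma^2), costs add across releases, and rho-zCDP implies (rho + 2 sqrt(rho ln(1/delta)), delta)-DP.

```python
import math
import numpy as np


def gaussian_zcdp_rho(sensitivity, sigma):
    """zCDP cost of one Gaussian-mechanism release: rho = Delta^2 / (2 sigma^2)."""
    return sensitivity ** 2 / (2.0 * sigma ** 2)


def zcdp_to_approx_dp(rho, delta):
    """Standard conversion from rho-zCDP to (epsilon, delta)-DP; returns epsilon."""
    return rho + 2.0 * math.sqrt(rho * math.log(1.0 / delta))


def noisy_ts_posterior(X, r, sigma_dp, noise_var=1.0, prior_prec=1.0, rng=None):
    """Gaussian posterior over theta built from a noised statistic X^T r.

    Assumption (illustrative only): the precision matrix A is left un-noised,
    so this function alone is NOT differentially private.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = X.shape[1]
    A = prior_prec * np.eye(d) + X.T @ X / noise_var   # posterior precision
    b = X.T @ r / noise_var                            # sufficient statistic
    b_noisy = b + rng.normal(0.0, sigma_dp, size=d)    # Gaussian privacy noise
    A_inv = np.linalg.inv(A)
    mean = A_inv @ b_noisy
    # A_inv @ noise has covariance sigma_dp^2 * A_inv @ A_inv, so the usual
    # posterior covariance A_inv is inflated by that term.
    cov = A_inv + sigma_dp ** 2 * (A_inv @ A_inv)
    return mean, cov


def choose_arm(contexts, mean, cov, rng):
    """Thompson step: sample theta from the inflated posterior, pick the best arm."""
    theta = rng.multivariate_normal(mean, cov)
    return int(np.argmax(contexts @ theta))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))
    r = X @ np.array([1.0, -0.5, 0.2]) + rng.normal(size=200)
    mean, cov = noisy_ts_posterior(X, r, sigma_dp=2.0, rng=rng)
    arm = choose_arm(rng.normal(size=(5, 3)), mean, cov, rng)
    # Total budget over 4 batched releases, each with sensitivity 1 and sigma 2.
    rho_total = 4 * gaussian_zcdp_rho(1.0, 2.0)
    print(arm, rho_total, zcdp_to_approx_dp(rho_total, 1e-5))
```

Because the inflation term grows with `sigma_dp`, a tighter privacy budget widens the sampling distribution, so the sampler explores more instead of trusting a noisy mean.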

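Impact: Strengthens privacy in reinforcement learning applications, potentially allowing personalization systems to use more sensitive data.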

Ranking rationale: This cluster contains a research paper detailing a new algorithm for contextual bandits with differential privacy.

Read on arXiv cs.LG →

AI-generated summary · Google Gemini · from 3 sources. How we write summaries →


Sources [3]

  1. arXiv cs.LG TIER_1 English(EN) · AmirHossein Naghdi, Ali Baheri ·

    Flow-Corrected Thompson Sampling for Non-Stationary Contextual Bandits

    arXiv:2606.23933v1 Announce Type: cross Abstract: We study non-stationary linear contextual bandits where the reward model drifts over time, rendering classical contextual bandit algorithms brittle because historical data becomes systematically biased. We propose Flow-Corrected T…

  2. arXiv cs.LG TIER_1 English(EN) · Ali Baheri ·

    Flow-Corrected Thompson Sampling for Non-Stationary Contextual Bandits

    We study non-stationary linear contextual bandits where the reward model drifts over time, rendering classical contextual bandit algorithms brittle because historical data becomes systematically biased. We propose Flow-Corrected Thompson Sampling (fcTS), a Bayesian method that re…

  3. arXiv stat.ML TIER_1 English(EN) · Eranga Ukwatta ·

    AdaPrivate-TS: Private Thompson Sampling for Contextual Bandits with Privacy Amplification

    We present AdaPrivate-TS, a differentially private contextual bandit algorithm that combines Thompson Sampling with batched zCDP composition. Our key insight is that differential privacy noise inflates the posterior covariance in a structured way: adding Gaussian noise $N(0,σ^2 I…