Online Convex Optimization with Sublinear Noisy Probes
Researchers have developed a new framework for Online Convex Optimization (OCO) that improves worst-case regret using only a limited, noisy budget of pairwise probes. The method unifies sublinear best-expert queries and pairwise feedback, and shows that a sublinear, noisy probe budget provably improves regret in the full-feedback OCO regime. The analysis quantifies the benefit of probing through variance reduction and a second-order analysis of Continuous Exponential Weights, yielding tight regret guarantees.
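For readers unfamiliar with Continuous Exponential Weights, the following is a minimal sketch of the base algorithm only, not the paper's probing method. It assumes a one-dimensional domain [0, 1] approximated by a uniform grid, hypothetical quadratic losses f_t(x) = (x - z_t)^2, and a learning rate `eta = 0.5`; none of these choices come from the source, and the sketch uses no probes.

```python
import math
import random


def continuous_exp_weights(targets, eta=0.5, grid_size=201):
    """Grid approximation of Continuous Exponential Weights on [0, 1].

    Assumed (hypothetical) losses: f_t(x) = (x - targets[t]) ** 2.
    Returns (plays, regret), with regret measured against the best
    fixed grid point in hindsight.
    """
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    cum_loss = [0.0] * grid_size  # cumulative loss of each grid point
    plays = []
    learner_loss = 0.0

    for z in targets:
        # Density proportional to exp(-eta * cumulative loss).
        # Subtracting the minimum keeps the exponentials numerically stable.
        m = min(cum_loss)
        weights = [math.exp(-eta * (c - m)) for c in cum_loss]
        total = sum(weights)

        # Play the mean of the current density.
        x = sum(w * g for w, g in zip(weights, grid)) / total
        plays.append(x)

        # Full feedback: the whole loss function is revealed after playing.
        learner_loss += (x - z) ** 2
        cum_loss = [c + (g - z) ** 2 for c, g in zip(cum_loss, grid)]

    best_fixed = min(cum_loss)
    return plays, learner_loss - best_fixed


random.seed(0)
targets = [0.7 + random.uniform(-0.1, 0.1) for _ in range(500)]
plays, regret = continuous_exp_weights(targets)
print(f"final play: {plays[-1]:.3f}, regret: {regret:.3f}")
```

In this sketch the first play is the midpoint of the domain, since the initial density is uniform, and later plays move toward the region where the cumulative loss is smallest. The paper's contribution concerns how a sublinear budget of noisy probes improves on this kind of full-feedback baseline; that mechanism is not modeled here.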