Researchers have developed a new framework for Online Convex Optimization (OCO) that improves worst-case regret using only a limited, noisy budget of pairwise probes. The method unifies two query models, a sublinear number of best-expert queries and pairwise feedback, and shows that a sublinear, noisy probe budget provably reduces regret in the full-feedback OCO regime. The analysis quantifies the benefit of probing through variance reduction and a second-order analysis of Continuous Exponential Weights, yielding tight regret guarantees.
AI-generated summary · Google Gemini · from 2 sources.