Two-Action Apple Tasting with Switching Costs
Researchers have analyzed the two-action apple-tasting problem with switching costs, a partial-feedback online learning setting. They found that the expected regret grows at most on the order of $\sqrt{T}$ in the horizon $T$, which improves on the previously assumed $\widetilde O(T^{2/3})$ bound. This finding removes a potential obstruction in the classification of feedback-graph algorithms.
AI IMPACT: Establishes a tighter theoretical bound for a class of learning algorithms, potentially influencing future algorithm design.