A new paper demonstrates that the AdaGrad optimization algorithm does not adapt to Hölder-smoothness for composite objectives. The research highlights a specific convex composite optimization problem where AdaGrad fails to achieve the expected convergence rate. This occurs because the gradient of the smooth term may not vanish at the optimum, leading AdaGrad to excessively reduce its stepsize and slow down convergence. The paper also suggests alternative accumulation mechanisms that avoid this issue.
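The stepsize mechanism described above can be illustrated with a minimal sketch. This is not the paper's construction. It assumes a hypothetical one-dimensional problem, minimizing 0.5 * (x + 1)^2 subject to x >= 0, and a scalar ("norm") variant of AdaGrad with a projection step. The function name and the parameters `x0`, `D`, and `steps` are made up for illustration. The constrained optimum is x* = 0, where the gradient of the smooth term equals 1 rather than 0, so the accumulator keeps growing and the stepsize keeps shrinking even once the iterate sits at the optimum.

```python
import math


def projected_adagrad_norm(x0=1.0, D=1.0, steps=1000):
    """Projected scalar AdaGrad on min_{x >= 0} 0.5 * (x + 1)^2 (toy problem)."""
    x = x0
    acc = 0.0          # running sum of squared gradients of the smooth term
    stepsizes = []
    for _ in range(steps):
        g = x + 1.0                  # gradient of the smooth term; g = 1 at x* = 0
        acc += g * g                 # accumulator grows by at least 1 every step
        eta = D / math.sqrt(acc)     # AdaGrad-style stepsize
        x = max(0.0, x - eta * g)    # projection onto the constraint x >= 0
        stepsizes.append(eta)
    return x, stepsizes


x_final, etas = projected_adagrad_norm()
print(x_final, etas[0], etas[-1])
```

In this toy the iterate reaches the optimum immediately, so it does not reproduce the slow convergence the paper proves. It only shows the ingredient behind it: because the gradient does not vanish at the optimum, the stepsize decays on the order of 1/sqrt(k) no matter how close the iterate is to the solution.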
AI-generated summary · Google Gemini · from 2 sources.