PulseAugur

New metric CWGD improves optimization noise measurement in ML

Researchers have introduced Curvature-Weighted Gradient Diversity (CWGD), a metric for measuring optimization noise in mini-batch stochastic gradient descent (SGD). Standard convergence analyses model gradient noise with a single variance term that treats all parameter directions equally. CWGD instead weights the noise by curvature, reflecting the observation that noise in high-curvature directions has less impact. Used in a CWGD-Cosine learning-rate schedule, the measure is reported to reduce final optimization error by approximately 20% compared to standard cosine annealing in quadratic settings, with negligible overhead.
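For readers unfamiliar with the baseline, the following is a minimal sketch of standard cosine annealing applied to noisy SGD on a toy quadratic. It illustrates only the comparison setup named in the summary, not the CWGD measure or the CWGD-Cosine schedule, whose definitions are not given here. The function names, the diagonal quadratic, and the additive Gaussian noise model are all assumptions made for illustration.

```python
import math
import random


def cosine_lr(step, total_steps, lr_max, lr_min=0.0):
    """Standard cosine annealing: decay from lr_max to lr_min over total_steps."""
    progress = min(step, total_steps) / total_steps
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))


def noisy_quadratic_sgd(curvatures, noise_stds, total_steps, lr_max, seed=0):
    """Run SGD on f(x) = 0.5 * sum(h_i * x_i**2) with noisy gradients.

    Assumption: gradient noise is independent Gaussian per coordinate,
    with standard deviation noise_stds[i]. Returns the final loss.
    """
    rng = random.Random(seed)
    x = [1.0] * len(curvatures)  # hypothetical starting point
    for step in range(total_steps):
        lr = cosine_lr(step, total_steps, lr_max)
        for i, (h, s) in enumerate(zip(curvatures, noise_stds)):
            grad = h * x[i] + rng.gauss(0.0, s)  # true gradient plus noise
            x[i] -= lr * grad
    return 0.5 * sum(h * xi * xi for h, xi in zip(curvatures, x))


if __name__ == "__main__":
    # Three directions with very different curvature but identical noise level.
    final_loss = noisy_quadratic_sgd([10.0, 1.0, 0.1], [0.5, 0.5, 0.5],
                                     total_steps=500, lr_max=0.1)
    print(f"final loss under cosine annealing: {final_loss:.6f}")
```

A geometry-adaptive schedule would replace `cosine_lr` with one that also reacts to a noise estimate. Here `lr_max` is chosen so that `lr_max` times the largest curvature stays below 2, which keeps plain gradient descent stable on this quadratic.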

IMPACT This new metric could lead to more efficient training of machine learning models by better managing learning rates.

RANK_REASON The cluster contains a research paper detailing a new metric and algorithm for machine learning optimization.

Read on arXiv stat.ML →

AI-generated summary · Google Gemini · from 2 sources.


COVERAGE [2]

  1. arXiv stat.ML TIER_1 English(EN) · Muhammad Hamza (Indian Institute of Technology Kharagpur), Ayush Goel (Indian Institute of Technology Kharagpur)

    Curvature-Weighted Gradient Diversity: A Noise Measure for Geometry-Adaptive SGD Schedules

    arXiv:2606.30455v1 (cross-list). Abstract: The standard convergence analysis of mini-batch stochastic gradient descent (SGD) models gradient noise using a single variance term that treats all parameter directions equally, ignoring the fact that noise in high-curvature dire…

  2. arXiv stat.ML TIER_1 English(EN) · Ayush Goel

    Curvature-Weighted Gradient Diversity: A Noise Measure for Geometry-Adaptive SGD Schedules

    The standard convergence analysis of mini-batch stochastic gradient descent (SGD) models gradient noise using a single variance term that treats all parameter directions equally, ignoring the fact that noise in high-curvature directions has less impact because learning rates are …