PulseAugur

Research paper analyzes compute efficiency and runtime tradeoffs for momentum methods

A new research paper examines the tradeoff between serial runtime and compute efficiency for stochastic momentum methods such as heavy ball (HB) and Accelerated SGD (ASGD). The study proves finite-dimensional lower bounds on batch-size tradeoffs, indicating that HB does not inherently improve compute efficiency over standard SGD for arbitrary spectra. Instead, HB preserves SGD-level compute efficiency over a wider range of batch sizes, which allows a lower serial runtime. ASGD's behavior depends on the spectrum: for rapidly decaying spectra it improves compute efficiency at small batch sizes, but it trades this gain for serial runtime as batch size increases.
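To make the two quantities concrete, here is a minimal sketch of minibatch SGD with heavy-ball momentum on a least-squares problem. It is not the paper's setup or experiment: the function names (`heavy_ball_sgd`, `loss`), the objective, and the hyperparameters are illustrative assumptions, and the update is the standard Polyak heavy-ball form. The sketch treats `steps` as the serial runtime (number of iterations) and assumes that compute can be counted as `steps * batch_size`, the number of samples processed.

```python
import numpy as np


def heavy_ball_sgd(A, b, batch_size, lr, momentum, steps, seed=0):
    """Minibatch SGD with heavy-ball momentum on 0.5 * mean((A @ w - b)**2).

    Setting momentum=0.0 recovers plain minibatch SGD.
    Returns the final iterate and the number of samples processed
    (an assumed proxy for compute in this sketch).
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    w = np.zeros(d)
    w_prev = w.copy()
    for _ in range(steps):
        # Sample a minibatch with replacement.
        idx = rng.integers(0, n, size=batch_size)
        # Stochastic gradient of the least-squares loss on the minibatch.
        grad = A[idx].T @ (A[idx] @ w - b[idx]) / batch_size
        # Heavy-ball update: gradient step plus a multiple of the last displacement.
        w_next = w - lr * grad + momentum * (w - w_prev)
        w_prev, w = w, w_next
    return w, steps * batch_size


def loss(A, b, w):
    """Full-data least-squares loss."""
    return 0.5 * np.mean((A @ w - b) ** 2)
```

A hypothetical usage: generate `A` with `np.random.default_rng(1).standard_normal((512, 8))`, set `b = A @ w_star` for some target `w_star`, then call `heavy_ball_sgd` at several batch sizes and compare the loss reached against both `steps` and the returned sample count. Which batch sizes help, and by how much, depends on the spectrum of the data, which is the question the paper analyzes.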

IMPACT This research provides theoretical insights into optimizing training efficiency for large-scale machine learning models.



AI-generated summary · Google Gemini · from 2 sources.

COVERAGE [2]

  1. arXiv cs.AI TIER_1 English(EN) · Depen Morwani, Alexandru Meterez, Pranav Nair, Sham Kakade ·

    Compute Efficiency and Serial Runtime Tradeoffs for Stochastic Momentum Methods

    arXiv:2606.19179v1. Abstract: Stochastic momentum methods such as heavy ball (HB), Nesterov momentum, and variants of Accelerated SGD (ASGD) [Kidambi et al., 2018] are widely used in modern training, but their stochastic benefits depend on two distinct quantit…

  2. arXiv stat.ML TIER_1 English(EN) · Sham Kakade ·

    Compute Efficiency and Serial Runtime Tradeoffs for Stochastic Momentum Methods

    Stochastic momentum methods such as heavy ball (HB), Nesterov momentum, and variants of Accelerated SGD (ASGD) [Kidambi et al., 2018] are widely used in modern training, but their stochastic benefits depend on two distinct quantities: serial runtime, the number of iterations need…