Research paper analyzes compute efficiency and runtime tradeoffs for momentum methods

By PulseAugur Editorial · [2 sources] · 2026-06-17 15:19

A new research paper examines the tradeoff between serial runtime (the number of iterations) and compute efficiency for stochastic momentum methods such as heavy ball (HB) and Accelerated SGD (ASGD). The study proves finite-dimensional lower bounds on batch-size tradeoffs, showing that HB does not inherently improve compute efficiency over standard SGD for arbitrary spectra. Instead, HB preserves SGD-level efficiency over a larger batch-size window, which allows reduced serial runtime. ASGD's behavior is spectrum-dependent: it offers improved small-batch compute efficiency for rapidly decaying spectra, but trades this for serial runtime as batch size increases.
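To make the two quantities concrete, here is a minimal Python sketch, not taken from the paper. It runs heavy ball on a diagonal quadratic whose curvatures play the role of the spectrum, and reports iterations (serial runtime) alongside iterations times batch size (used here as a stand-in for compute). The additive Gaussian noise model, the function names `noisy_grad` and `run`, and all parameter values are illustrative assumptions; the paper's own setting and definitions may differ.

```python
import random


def noisy_grad(w, eigs, batch_size, rng, noise):
    """Gradient of 0.5 * sum(eigs[i] * w[i]**2) plus Gaussian noise.

    Assumption: additive noise whose standard deviation shrinks as
    1 / sqrt(batch_size), mimicking the averaging done by a minibatch.
    """
    scale = noise / batch_size ** 0.5
    return [lam * wi + rng.gauss(0.0, scale) for lam, wi in zip(eigs, w)]


def run(eigs, batch_size, lr, momentum, target,
        noise=1.0, max_iters=100_000, seed=0):
    """Heavy-ball iterations until the loss first drops to `target`.

    momentum=0.0 recovers plain SGD. Returns the iteration count
    (serial runtime) and iterations * batch_size (compute proxy),
    or None if the target is not reached within max_iters.
    """
    rng = random.Random(seed)
    w = [1.0] * len(eigs)   # hypothetical starting point
    v = [0.0] * len(eigs)   # velocity, i.e. w_t - w_{t-1}
    for t in range(1, max_iters + 1):
        g = noisy_grad(w, eigs, batch_size, rng, noise)
        # Heavy ball: w_{t+1} = w_t - lr * g_t + momentum * (w_t - w_{t-1})
        v = [momentum * vi - lr * gi for vi, gi in zip(v, g)]
        w = [wi + vi for wi, vi in zip(w, v)]
        loss = 0.5 * sum(lam * wi * wi for lam, wi in zip(eigs, w))
        if loss <= target:
            return {"iterations": t, "compute": t * batch_size}
    return None


if __name__ == "__main__":
    eigs = [1.0, 0.1, 0.01]  # hypothetical spectrum
    for batch_size in (1, 16, 256):
        sgd = run(eigs, batch_size, lr=0.5, momentum=0.0, target=0.05)
        hb = run(eigs, batch_size, lr=0.5, momentum=0.5, target=0.05)
        print(batch_size, "SGD:", sgd, "HB:", hb)
```

Sweeping `batch_size` and comparing the two returned numbers for `momentum=0.0` against a nonzero momentum is one way to explore the kind of tradeoff the summary describes, though a toy run like this does not reproduce the paper's bounds.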

IMPACT The results offer theoretical guidance on how batch size and the choice of momentum method trade compute efficiency against serial runtime during training.

AI-generated summary · Google Gemini · from 2 sources.

COVERAGE [2]

arXiv cs.AI TIER_1 English(EN) · Depen Morwani, Alexandru Meterez, Pranav Nair, Sham Kakade · 2026-06-18 04:00

Compute Efficiency and Serial Runtime Tradeoffs for Stochastic Momentum Methods

arXiv:2606.19179v1 (cross-listed). Abstract: Stochastic momentum methods such as heavy ball (HB), Nesterov momentum, and variants of Accelerated SGD (ASGD) [Kidambi et al., 2018] are widely used in modern training, but their stochastic benefits depend on two distinct quantit…

arXiv stat.ML TIER_1 English(EN) · Sham Kakade · 2026-06-17 15:19

Compute Efficiency and Serial Runtime Tradeoffs for Stochastic Momentum Methods

Stochastic momentum methods such as heavy ball (HB), Nesterov momentum, and variants of Accelerated SGD (ASGD) [Kidambi et al., 2018] are widely used in modern training, but their stochastic benefits depend on two distinct quantities: serial runtime, the number of iterations need…
