PulseAugur

New HybridSGD method optimizes distributed-memory AI training

Researchers have developed HybridSGD, a 2D parallel stochastic gradient descent method for distributed-memory systems. It offers a continuous trade-off between two existing 1D methods, s-step SGD and Federated SGD with Averaging (FedAvg). According to the authors' theoretical analysis, HybridSGD has advantages in convergence, computation, communication, and memory usage. In empirical evaluations on a Cray EX supercomputing system, HybridSGD achieved better convergence than FedAvg and significant speedups over both s-step SGD and FedAvg on binary classification tasks.
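For context, FedAvg, one of the 1D baselines named above, can be sketched in a few lines. This is a single-process simulation on a toy binary classification problem, not HybridSGD and not the authors' code. The function names (`local_sgd`, `fedavg`, `make_data`), the synthetic data, and all hyperparameters are assumptions chosen for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def local_sgd(w, data, lr, steps, rng):
    # Run `steps` single-sample SGD updates on one worker's data shard.
    w = list(w)
    for _ in range(steps):
        x, y = data[rng.randrange(len(data))]
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        # Gradient of the logistic loss for one sample is (p - y) * x.
        w = [wi - lr * (p - y) * xi for wi, xi in zip(w, x)]
    return w


def fedavg(shards, dim, rounds=50, local_steps=20, lr=0.1, seed=0):
    # Hypothetical driver: in each round every simulated worker starts from
    # the shared weights, runs local SGD without communicating, and the
    # resulting weights are averaged (the only communication step).
    rng = random.Random(seed)
    w = [0.0] * dim
    for _ in range(rounds):
        local = [local_sgd(w, shard, lr, local_steps, rng) for shard in shards]
        w = [sum(ws[j] for ws in local) / len(local) for j in range(dim)]
    return w


def accuracy(w, data):
    # Fraction of samples whose predicted label (p >= 0.5) matches y.
    hits = 0
    for x, y in data:
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        hits += int((p >= 0.5) == (y == 1))
    return hits / len(data)


def make_data(n, seed=1):
    # Synthetic, linearly separable labels: y = 1 when x0 + x1 > 0.
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1), 1.0]  # last entry is a bias term
        out.append((x, 1 if x[0] + x[1] > 0 else 0))
    return out


data = make_data(400)
shards = [data[i::4] for i in range(4)]  # 4 simulated workers
w = fedavg(shards, dim=3)
```

Raising `local_steps` reduces how often the workers synchronize, at the cost of letting their local weights drift apart between averaging steps. That is the kind of communication-versus-convergence trade-off the summary refers to.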

IMPACT This research could lead to more efficient training of large AI models on distributed computing systems.

RANK_REASON The cluster contains an academic paper detailing a new algorithm for distributed optimization.

Read on arXiv stat.ML →

AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. arXiv stat.ML TIER_1 English(EN) · Aditya Devarakonda, Ramakrishnan Kannan

    Communication-Efficient, 2D Parallel Stochastic Gradient Descent for Distributed-Memory Optimization

    arXiv:2501.07526v2 Abstract: Distributed-memory implementations of numerical optimization algorithms, such as stochastic gradient descent (SGD), require interprocessor communication at every iteration of the algorithm. On modern distributed-memory clus…