PulseAugur

New SDP framework cuts model training memory use by up to 60%

Researchers have developed Subnetwork Data Parallelism (SDP), a distributed training framework that targets the high memory demands and communication costs of pre-training large neural networks. SDP partitions a model into structured subnetworks that are trained across workers without exchanging activations, which lowers per-device memory usage. The framework combines backward and forward masking with neuron-level or block-level subnetwork construction, and the authors report efficiency gains and improved performance in FLOP-matched settings.
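As a rough illustration of the idea only, and not the paper's implementation, the NumPy sketch below trains a two-layer MLP where each worker sees a neuron-level subnetwork of the hidden layer. Everything specific here is an assumption made for the example: the names `neuron_masks` and `masked_grads`, the round-robin masking scheme, the choice to mask hidden units in both the forward and backward pass, and the rule of averaging each parameter's gradient over the workers that kept it. Workers exchange only parameter gradients, never activations.

```python
import numpy as np

def neuron_masks(hidden, num_workers):
    # Hypothetical scheme: worker w drops neuron i when (i + w) % num_workers == 0,
    # so every neuron is kept by exactly num_workers - 1 workers.
    idx = np.arange(hidden)
    rows = [((idx + w) % num_workers != 0) for w in range(num_workers)]
    return np.stack(rows).astype(np.float64)

def masked_grads(x, t, W1, W2, m):
    # Forward pass with masked hidden units (assumed forward-masking-style variant).
    z = x @ W1
    h = np.maximum(z, 0.0) * m
    y = h @ W2
    # Backward pass for a mean squared error loss, written out by hand.
    dy = (y - t) / x.shape[0]
    gW2 = h.T @ dy
    dz = (dy @ W2.T) * m * (z > 0)
    gW1 = x.T @ dz
    return gW1, gW2

rng = np.random.default_rng(0)
D, H, O, N, B = 8, 12, 3, 4, 16  # input, hidden, output sizes; workers; batch size
W1 = rng.normal(size=(D, H)) * 0.1
W2 = rng.normal(size=(H, O)) * 0.1

masks = neuron_masks(H, N)
coverage = masks.sum(axis=0)  # number of workers that keep each hidden neuron

worker_grads = []
sum_gW1 = np.zeros_like(W1)
sum_gW2 = np.zeros_like(W2)
for w in range(N):
    # Each worker draws its own data shard (random data stands in for a real shard).
    x = rng.normal(size=(B, D))
    t = rng.normal(size=(B, O))
    g1, g2 = masked_grads(x, t, W1, W2, masks[w])
    worker_grads.append((g1, g2))
    sum_gW1 += g1
    sum_gW2 += g2

# Average each parameter's gradient over the workers whose subnetwork contains it.
avg_gW1 = sum_gW1 / coverage[None, :]
avg_gW2 = sum_gW2 / coverage[:, None]
```

The masks here multiply full-size arrays for readability, so this sketch saves no memory by itself. A memory-saving version would store and update only the kept columns of `W1` and kept rows of `W2` on each worker.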

IMPACT Reduces memory requirements for training large models, potentially enabling more efficient development and deployment of AI.

RANK_REASON This is a research paper detailing a new method for distributed training of neural networks.


AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

  1. arXiv cs.AI TIER_1 English(EN) · Vaibhav Singh, Zafir Khalid, Pietro Cagnasso, Edouard Oyallon, Eugene Belilovsky

    Model Parallelism With Subnetwork Data Parallelism

    arXiv:2507.09029v5 Announce Type: replace-cross Abstract: Pre-training large neural networks at scale imposes heavy memory demands on accelerators and often requires costly communication. We introduce Subnetwork Data Parallelism (SDP), a distributed training framework that partit…