PulseAugur
实时 11:38:50

ScaleAcross Explorer optimizes AI training across data centers

Researchers have developed ScaleAcross Explorer, a novel optimizer designed to enhance the efficiency of large-scale AI model training across multiple data centers and regions. This approach, informed by Meta's production experience, addresses the complexities of distributing hundreds of thousands of GPUs. The optimizer systematically explores parallelism placement, scheduling, and network technologies to achieve significant training speedups, demonstrating up to 64.62% improvement over existing configurations. AI

影响 Optimizes distributed AI training, potentially reducing costs and accelerating frontier model development.

排序理由 The cluster contains an academic paper detailing a new method for optimizing AI model training infrastructure. [lever_c_demoted from research: ic=1 ai=1.0]

在 arXiv cs.AI 阅读 →

AI 生成摘要 · Google Gemini · 来自 1 个来源。 我们如何撰写摘要 →

报道来源 [1]

  1. arXiv cs.AI TIER_1 English(EN) · Minghao Li, Alicia Golden, Samuel Hsia, Michael Kuchnik, Adi Gangidi, Xu Zhang, Ashmitha Jeevaraj Shetty, Zachary DeVito, Weiwei Chu, Dong He, Haoci Zhang, Yuchen Hao, Ruoming Pang, James Hongyi Zeng, Ying Zhang, Minlan Yu, Carole-Jean Wu ·

    ScaleAcross Explorer: Exploring Communication Optimization for Scale-Across AI Model Training

    arXiv:2605.24326v1 Announce Type: cross Abstract: The rapid scaling of large language model training requires distributing GPU resources across multiple data center buildings and regions. We refer to such paradigm as "scale-across" training. As infrastructure expands, the system …