PulseAugur

New research links optimizers to mode connectivity in neural networks

Researchers have examined the role of the optimizer in mode connectivity, an aspect of neural network training that has remained underexplored. Their work shows that the solutions produced by a single optimizer, such as AdamW or Muon, form a connected set in two-layer ReLU networks of sufficient width. The study further characterizes how the solution regions of different optimizers relate to each other: they can be disjoint or overlapping, depending on regularization and network width. In GPT-2 pretraining experiments, paths between solutions from the same optimizer maintained spectral properties, while cross-optimizer paths exhibited smoother transitions, pointing to optimizer-dependent structure.
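To make the idea of a path between two solutions concrete, here is a minimal sketch that evaluates the loss along a straight line between two parameter settings of a two-layer ReLU network. It is only an illustration of linear interpolation, not the construction used in the paper, which the summary does not describe. The function names, the mean-squared-error loss, the toy data, and the barrier convention (highest loss on the path minus the larger endpoint loss) are all assumptions made for this example.

```python
import numpy as np


def two_layer_relu(params, X):
    """Forward pass of a two-layer ReLU network: relu(X @ W1) @ w2."""
    W1, w2 = params
    return np.maximum(X @ W1, 0.0) @ w2


def mse(params, X, y):
    """Mean squared error of the network on (X, y)."""
    return float(np.mean((two_layer_relu(params, X) - y) ** 2))


def interpolate(params_a, params_b, t):
    """Point at position t in [0, 1] on the straight line from params_a to params_b."""
    return tuple((1.0 - t) * a + t * b for a, b in zip(params_a, params_b))


def loss_along_path(params_a, params_b, X, y, num_points=11):
    """Loss at evenly spaced points on the linear path between two solutions."""
    ts = np.linspace(0.0, 1.0, num_points)
    losses = np.array([mse(interpolate(params_a, params_b, t), X, y) for t in ts])
    return ts, losses


def loss_barrier(losses):
    """Assumed convention: highest loss on the path minus the larger endpoint loss."""
    return float(np.max(losses) - max(losses[0], losses[-1]))


# Toy usage with random (untrained) parameters, purely to show the mechanics.
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 5))
y = rng.normal(size=64)
params_a = (rng.normal(size=(5, 16)), rng.normal(size=16))
params_b = (rng.normal(size=(5, 16)), rng.normal(size=16))

ts, losses = loss_along_path(params_a, params_b, X, y)
print("barrier on the linear path:", loss_barrier(losses))
```

In practice the two endpoints would be trained solutions, for example one from AdamW and one from Muon, and a barrier near zero would suggest the two are connected along that particular path. A large barrier on the straight line does not rule out a low-loss curved path.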

Impact: Reveals optimizer-dependent structure in model training, potentially influencing future optimization techniques for large models.

Ranking rationale: Academic paper detailing novel findings on optimizer-induced mode connectivity in neural networks.


AI-generated summary · Google Gemini · based on 1 source.


Sources [1]

  1. arXiv cs.AI · TIER_1 · English (EN) · Mert Pilanci

    Optimizer-Induced Mode Connectivity: From AdamW to Muon

    Mode connectivity has been widely studied, yet the role of the optimizer remains underexplored. We revisit it through optimizer-induced implicit regularization, asking how connectivity behaves when restricted to solutions constrained by a given optimizer. For two-layer ReLU netwo…