Local SGD Worker Disagreement Reveals Deep Neural Network Loss Geometry

By PulseAugur Editorial · [1 sources] · 2026-05-28 04:00

Researchers have proposed a method for probing the loss geometry of deep neural networks by analyzing worker disagreement in Local Stochastic Gradient Descent (Local SGD). This disagreement, which the authors show theoretically to be shaped by gradient noise and Hessian curvature, yields a low-cost, Hessian-free estimator of the dominant subspace of the loss landscape. Experiments with MLPs, CNNs, and Transformers confirm that the subspaces identified from worker-average gaps capture the gradient components that lie in the dominant Hessian eigenspace.
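
To make the idea of a worker-average gap concrete, here is a minimal toy sketch in NumPy. It is not the paper's algorithm or code. It assumes a quadratic loss with three sharp Hessian directions and a flat bulk, and it assumes the gradient-noise covariance equals the Hessian (a simplification chosen for illustration). All names (`local_sgd_gaps`, `noise_sqrt`, `overlap`) and all hyperparameters are hypothetical. The sketch runs Local SGD, stacks each worker's deviation from the average just before synchronization, and takes the top singular vectors of those gaps as a subspace estimate that never forms or queries the Hessian.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, k_sharp = 50, 3

# Toy Hessian: three sharp eigenvalues plus a flat bulk, in a random orthonormal basis.
eigvals = np.concatenate([[10.0, 8.0, 6.0], np.full(dim - k_sharp, 0.01)])
basis, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
hessian = (basis * eigvals) @ basis.T
# ASSUMPTION (not from the source): gradient-noise covariance equals the Hessian.
noise_sqrt = (basis * np.sqrt(eigvals)) @ basis.T


def local_sgd_gaps(w_start, n_workers=8, n_rounds=50, local_steps=8, lr=0.05):
    """Run Local SGD on L(w) = 0.5 * w^T H w and collect worker-minus-average gaps."""
    w_avg = w_start.copy()
    gaps = []
    for _ in range(n_rounds):
        # Every worker starts the round from the current average.
        workers = np.tile(w_avg, (n_workers, 1))
        for _ in range(local_steps):
            noise = rng.standard_normal((n_workers, dim)) @ noise_sqrt
            grads = workers @ hessian + noise  # exact gradient plus per-worker noise
            workers -= lr * grads
        w_avg = workers.mean(axis=0)
        gaps.append(workers - w_avg)  # disagreement just before synchronization
    return np.vstack(gaps)


gaps = local_sgd_gaps(rng.standard_normal(dim))

# Hessian-free subspace estimate: top right-singular vectors of the stacked gaps.
_, _, vt = np.linalg.svd(gaps, full_matrices=False)
estimated = vt[:k_sharp].T      # shape (dim, k_sharp)
true_top = basis[:, :k_sharp]   # used only to score the estimate

# Fraction of the true sharp subspace captured by the estimate, in [0, 1].
overlap = np.linalg.norm(true_top.T @ estimated) ** 2 / k_sharp
print(f"subspace overlap: {overlap:.3f}")
```

In this toy setting the overlap should come out close to 1, because the assumed noise injects far more variance along the sharp directions than the local steps can damp. The outcome depends on that noise assumption: with isotropic noise on the same quadratic, the gaps would instead be larger along flat directions, which is why the interplay of gradient noise and curvature matters.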


AI-generated summary · Google Gemini · from 1 source.

COVERAGE [1]

arXiv cs.AI TIER_1 English(EN) · Tolga Dimlioglu, Kristi Topollai, Anna Choromanska · 2026-05-28 04:00

Worker Disagreement Reveals Sharp Directions in Local SGD

arXiv:2605.27739v1 Announce Type: cross Abstract: Deep neural network training often exhibits highly anisotropic loss geometry, where a few sharp dominant Hessian directions coexist with a large flatter bulk. Gradients tend to align disproportionately with these dominant directio…

