Researchers propose per-sample clipping for robust and fast AI model training

By PulseAugur Editorial Team · [3 sources] · 2026-05-04 15:11

Researchers have proposed per-sample clipped SGD (PS-Clip-SGD), a training method built on a robust gradient estimator that clips each sample's gradient individually. The authors show that the method achieves optimal in-expectation convergence rates for non-convex optimization, with guarantees that hold even under heavy-tailed gradient noise. In experiments, PS-Clip-SGD outperformed standard techniques when training AlexNet on CIFAR-100, and it also showed benefits when combined with gradient accumulation.
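To make the idea concrete, here is a minimal NumPy sketch of the generic per-sample clipping estimator: each sample's gradient is rescaled to have norm at most a threshold, and the clipped gradients are then averaged. It is an illustration under stated assumptions, not the authors' implementation. The function names, the fixed threshold `clip_threshold`, and the toy gradients are all hypothetical, and the paper's exact estimator and threshold choice may differ. For contrast, the sketch also includes the more common variant that clips the averaged batch gradient.

```python
import numpy as np


def clip_rows(per_sample_grads, clip_threshold):
    """Rescale each row (one sample's gradient) to L2 norm <= clip_threshold."""
    norms = np.linalg.norm(per_sample_grads, axis=1, keepdims=True)
    # Factor min(1, c / ||g_i||); the small floor avoids division by zero.
    scale = np.minimum(1.0, clip_threshold / np.maximum(norms, 1e-12))
    return per_sample_grads * scale


def per_sample_clipped_gradient(per_sample_grads, clip_threshold):
    """Clip every sample's gradient first, then average (per-sample clipping)."""
    return clip_rows(per_sample_grads, clip_threshold).mean(axis=0)


def batch_clipped_gradient(per_sample_grads, clip_threshold):
    """Average first, then clip the mean once (standard batch clipping)."""
    mean_grad = per_sample_grads.mean(axis=0)
    norm = np.linalg.norm(mean_grad)
    return mean_grad * min(1.0, clip_threshold / max(norm, 1e-12))


def accumulated_per_sample_clipped_gradient(micro_batches, clip_threshold):
    """Accumulate clipped per-sample gradients over several micro-batches.

    Because clipping acts on each sample separately, summing the clipped
    gradients across micro-batches and dividing by the total sample count
    gives the same estimate as processing one large batch.
    """
    total = sum(clip_rows(g, clip_threshold).sum(axis=0) for g in micro_batches)
    count = sum(len(g) for g in micro_batches)
    return total / count


# Toy batch (hypothetical numbers): two ordinary gradients and one outlier,
# standing in for a heavy-tailed noise sample.
grads = np.array([[1.0, 0.0], [0.0, 1.0], [1000.0, 0.0]])
c = 1.0

ps_estimate = per_sample_clipped_gradient(grads, c)
batch_estimate = batch_clipped_gradient(grads, c)
accumulated = accumulated_per_sample_clipped_gradient([grads[:2], grads[2:]], c)

print("per-sample clipped:", ps_estimate)
print("batch clipped:     ", batch_estimate)
print("accumulated:       ", accumulated)
```

In this toy batch the outlier is capped at the same norm as the other samples before averaging, so the second coordinate still contributes to the update. With batch clipping, the outlier dominates the mean before the clip is applied, and the resulting direction is almost entirely along the first coordinate. The accumulation helper only illustrates why per-sample clipping composes naturally with gradient accumulation; it is not a reproduction of the paper's experiments.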

Impact: Introduces a novel training technique that could lead to more efficient and stable model development.

Ranking rationale: Academic paper detailing a new optimization method for machine learning.


AI-generated summary · Google Gemini · based on 3 sources.

Sources [3]

Hugging Face Daily Papers TIER_1 English(EN) · 2026-05-04 15:11

Robust and Fast Training via Per-Sample Clipping

We propose a robust gradient estimator based on per-sample gradient clipping and analyze its properties both theoretically and empirically. We show that the resulting method, per-sample clipped SGD (PS-Clip-SGD), achieves optimal in-expectation convergence rates for non-convex op…
arXiv stat.ML TIER_1 English(EN) · Davide Nobile, Philipp Grohs · 2026-05-05 04:00

Robust and Fast Training via Per-Sample Clipping

arXiv:2605.02701v1 Announce Type: cross Abstract: We propose a robust gradient estimator based on per-sample gradient clipping and analyze its properties both theoretically and empirically. We show that the resulting method, per-sample clipped SGD (PS-Clip-SGD), achieves optimal …
arXiv stat.ML TIER_1 English(EN) · Philipp Grohs · 2026-05-04 15:11

Robust and Fast Training via Per-Sample Clipping

We propose a robust gradient estimator based on per-sample gradient clipping and analyze its properties both theoretically and empirically. We show that the resulting method, per-sample clipped SGD (PS-Clip-SGD), achieves optimal in-expectation convergence rates for non-convex op…
