PulseAugur

New research details feature learning in wide neural networks

A new paper studies feature learning in wide two-layer neural networks under the Maximal Update Parametrization ($\mu$P) and establishes four structural results. First, it proves global existence and uniqueness of the mean-field limit of noisy gradient descent. Second, it characterizes the identifiability of this limit. Third, it shows that, under specific conditions, the active support of the long-time limit measure admits a sparse-dictionary decomposition. Fourth, it decomposes the total feature-learning error into separate components.
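For readers unfamiliar with the setting, the following is a minimal sketch of noisy gradient descent on a two-layer network with mean-field $1/N$ output scaling. It is a generic illustration, not the paper's construction: the summary does not give the paper's equations, so the `tanh` activation, the toy target, the learning-rate scaling by $N$, the `temperature` parameter, and every function name here are assumptions.

```python
import numpy as np

def forward(a, W, X):
    """f(x) = (1/N) * sum_i a_i * tanh(w_i . x), with 1/N mean-field output scaling."""
    return np.tanh(X @ W.T) @ a / a.shape[0]

def mse(a, W, X, y):
    """Half mean squared error of the network on (X, y)."""
    return 0.5 * np.mean((forward(a, W, X) - y) ** 2)

def noisy_gd_step(a, W, X, y, lr, temperature, rng):
    """One full-batch gradient step plus Gaussian noise (Langevin-style; assumed form)."""
    N, n = a.shape[0], y.shape[0]
    H = np.tanh(X @ W.T)                      # (n, N) hidden activations
    resid = H @ a / N - y                     # (n,) prediction residuals
    grad_a = H.T @ resid / (N * n)            # gradient w.r.t. output weights
    grad_W = (resid[:, None] * (1.0 - H**2) * a[None, :]).T @ X / (N * n)
    # Assumed convention: scale the step by N so each neuron moves at an O(1) rate.
    noise_std = np.sqrt(2.0 * lr * temperature)
    a_new = a - lr * N * grad_a + noise_std * rng.standard_normal(a.shape)
    W_new = W - lr * N * grad_W + noise_std * rng.standard_normal(W.shape)
    return a_new, W_new

def train(n_neurons=64, dim=2, n_samples=128, steps=300,
          lr=0.05, temperature=0.0, seed=0):
    """Train on a hypothetical toy target; return (initial loss, final loss, a, W)."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n_samples, dim))
    y = np.tanh(X[:, 0])                      # toy target, chosen only for illustration
    a = rng.standard_normal(n_neurons)
    W = rng.standard_normal((n_neurons, dim))
    initial = mse(a, W, X, y)
    for _ in range(steps):
        a, W = noisy_gd_step(a, W, X, y, lr, temperature, rng)
    return initial, mse(a, W, X, y), a, W

if __name__ == "__main__":
    initial, final, _, _ = train(temperature=1e-4)
    print(f"loss before: {initial:.4f}, after: {final:.4f}")
```

In this sketch each neuron $(a_i, w_i)$ acts as one particle, and the mean-field limit described in the summary concerns the distribution of such particles as the width $N$ grows. Setting `temperature=0.0` recovers plain gradient descent.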

IMPACT This research provides theoretical insights into the feature learning capabilities of wide neural networks, potentially informing future model architectures and training methodologies.

RANK_REASON The cluster contains an academic paper detailing theoretical research in machine learning.

Read on arXiv cs.LG →

AI-generated summary · Google Gemini · from 2 sources.

COVERAGE [2]

  1. arXiv cs.LG TIER_1 English(EN) · Akmal Xodarev

    Feature Learning in Wide Neural Networks under $\mu$P: Identifiability and Sparse-Dictionary Decomposition of the Mean-Field Limit

    arXiv:2605.24710v1. Abstract: We establish four structural results for feature learning in wide two-layer neural networks under the Maximal Update Parametrization ($\mu$P). First, we prove global existence and uniqueness of the mean-field limit of noisy gradient…

  2. arXiv stat.ML TIER_1 English(EN) · Akmal Xodarev

    Feature Learning in Wide Neural Networks under $\mu$P: Identifiability and Sparse-Dictionary Decomposition of the Mean-Field Limit

    We establish four structural results for feature learning in wide two-layer neural networks under the Maximal Update Parametrization ($\mu$P). First, we prove global existence and uniqueness of the mean-field limit of noisy gradient descent under $\mu$P, identifying the maximal admis…