PulseAugur
EN
LIVE 10:49:47

New StoMPP method improves binary neural network training

Researchers have introduced StoMPP (Stochastic Masked Partial Progressive Binarization), a novel training method for binary neural networks (BNNs) that avoids the accuracy degradation typically seen with deeper networks when using the straight-through estimator (STE). StoMPP gradually binarizes network layers from input to output, offering significant accuracy improvements across various architectures like ResNet-50, MobileNetV2, and BERT, even in an STE-free setting. When combined with surrogate gradients, this progressive binarization further enhances performance, with the study highlighting that the order of layer progression is critical for preventing gradient blockades and maintaining network depth. AI

IMPACT This research offers a novel approach to training binary neural networks, potentially enabling more efficient models for deployment on resource-constrained devices.

RANK_REASON The cluster contains a research paper detailing a new method for training binary neural networks.

Read on arXiv cs.LG →

AI-generated summary · Google Gemini · from 2 sources. How we write summaries →

New StoMPP method improves binary neural network training

COVERAGE [2]

  1. arXiv cs.LG TIER_1 English(EN) · Evan Gibson Smith, Bashima Islam ·

    Layerwise Progressive Freezing: A Training Scaffold for Depth-Scalable Binary Networks

    arXiv:2606.27759v1 Announce Type: new Abstract: Training binary neural networks (BNNs) from scratch is dominated by the straight-through estimator (STE), whose forward/backward mismatch produces severe accuracy degradation as networks deepen. We study an orthogonal axis: when and…

  2. arXiv cs.LG TIER_1 English(EN) · Bashima Islam ·

    Layerwise Progressive Freezing: A Training Scaffold for Depth-Scalable Binary Networks

    Training binary neural networks (BNNs) from scratch is dominated by the straight-through estimator (STE), whose forward/backward mismatch produces severe accuracy degradation as networks deepen. We study an orthogonal axis: when and where binarization is enforced during training.…