PulseAugur / Brief


last 24h
[1/1] 222 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

  1. HORST: Composing Optimizer Geometries for Sparse Transformer Training

    Researchers have developed HORST, a novel optimizer designed to improve the training of sparse transformers. Standard optimizers struggle to balance sparsity against training stability. HORST composes optimizer steps as non-commutative operators and integrates hyperbolic geometry to obtain both stability and an L1 sparsity bias. In the reported experiments, HORST outperforms AdamW baselines across vision and language tasks, with the gap most pronounced at higher sparsity levels.

    IMPACT: Enables more efficient training of sparse transformer models, potentially leading to smaller and faster AI systems.
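
The summary's point that optimizer steps compose as non-commutative operators can be illustrated with a toy example. This is a minimal sketch and not HORST itself: it leaves out the hyperbolic geometry entirely and uses plain soft-thresholding (the standard proximal operator of an L1 penalty) as a stand-in for the sparsity bias. The function names `grad_step` and `soft_threshold` and all numeric values are hypothetical, chosen only for illustration.

```python
def grad_step(w, grad, lr):
    """Plain gradient step: w - lr * grad, elementwise."""
    return [wi - lr * gi for wi, gi in zip(w, grad)]


def soft_threshold(w, lam):
    """Proximal operator of lam * ||w||_1: shrink each entry toward zero by lam."""
    out = []
    for wi in w:
        mag = max(abs(wi) - lam, 0.0)
        out.append(mag if wi >= 0 else -mag)
    return out


# Hypothetical toy values (exactly representable in binary floating point).
w = [1.0, -0.125, 0.5]
grad = [1.0, -0.5, 2.0]
lr, lam = 0.5, 0.25

# Same two operators, applied in opposite orders.
grad_then_shrink = soft_threshold(grad_step(w, grad, lr), lam)
shrink_then_grad = grad_step(soft_threshold(w, lam), grad, lr)

print(grad_then_shrink)  # [0.25, 0.0, -0.25]
print(shrink_then_grad)  # [0.25, 0.25, -0.75]
```

The two orders give different weights, and only the first leaves an exact zero. That is why the order in which a sparsity step and an update step are composed matters, although how HORST orders or couples its steps is not described in the summary above.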