PulseAugur / Brief


Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

  1. LLMs as Noisy Channels: A Shannon Perspective on Model Capacity and Scaling Laws

    Researchers have introduced the Shannon Scaling Law, a theoretical framework that treats Large Language Model (LLM) training as information transmission through a noisy channel, by analogy with the Shannon-Hartley theorem. By relating the signal-to-noise ratio (SNR) to model capacity and training data, the framework accounts for non-monotonic phenomena such as overtraining and quantization-induced degradation. In experiments on Pythia and OLMo2 models, the Shannon Scaling Law predicted model performance significantly better than existing scaling laws, including when extrapolating to unseen model sizes.

    IMPACT Offers a new theoretical lens on LLM scaling that could guide future model development and optimization strategies.
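
For readers unfamiliar with the Shannon-Hartley theorem referenced in the summary, the classical result gives the capacity of a noisy channel as C = B * log2(1 + S/N), where B is the bandwidth and S/N is the linear signal-to-noise ratio. The sketch below computes only that classical formula. The function name and sample values are illustrative assumptions, and the brief does not state how the paper maps model capacity or training data onto these quantities, so this is not the Shannon Scaling Law itself.

```python
import math


def shannon_hartley_capacity(bandwidth_hz: float, snr_linear: float) -> float:
    """Classical Shannon-Hartley channel capacity in bits per second.

    bandwidth_hz: channel bandwidth B in hertz (must be non-negative).
    snr_linear:   signal-to-noise ratio S/N as a linear ratio, not in dB.

    This is the textbook theorem only; it is NOT the paper's scaling law.
    """
    if bandwidth_hz < 0 or snr_linear < 0:
        raise ValueError("bandwidth and SNR must be non-negative")
    return bandwidth_hz * math.log2(1.0 + snr_linear)


if __name__ == "__main__":
    # Hypothetical values: capacity grows only logarithmically with SNR,
    # so each doubling of SNR adds less and less capacity.
    for snr in (1.0, 3.0, 7.0, 15.0):
        print(snr, shannon_hartley_capacity(1.0, snr))
```

The logarithmic dependence on S/N is the property the analogy leans on: when noise grows relative to signal, usable capacity falls, which is the kind of behaviour the summary associates with overtraining and quantization.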