PulseAugur / Brief

last 24h
[1/1] 223 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

  1. SeedLM: Compressing LLM Weights into Seeds of Pseudo-Random Generators

    Researchers have developed SeedLM, a post-training compression technique for large language models that encodes model weights as seeds of pseudo-random generators. Weight matrices are regenerated on the fly during inference, which trades extra compute for fewer memory accesses and speeds up memory-bound tasks, lowering the runtime cost of LLMs. SeedLM requires no calibration data, generalizes well across diverse tasks, and maintains accuracy comparable to FP16 baselines even at significant compression levels.

    IMPACT This compression technique could significantly reduce the deployment costs and increase the inference speed of large language models.
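
The seed-based encoding described in the summary can be illustrated with a small sketch. This is not the authors' implementation: it assumes NumPy's default generator stands in for whatever pseudo-random generator SeedLM uses, and the function names, block size, seed budget (`num_seeds`), and number of coefficients (`rank`) are all hypothetical choices made for illustration. Each weight block is stored as one seed plus a few coefficients, and the block is rebuilt by regenerating the same pseudo-random basis from that seed.

```python
import numpy as np


def compress_block(block, num_seeds=256, rank=4):
    """Return (seed, coeffs) for the seed whose random basis fits `block` best.

    Hypothetical sketch: try every seed in [0, num_seeds), generate a
    pseudo-random basis from it, fit coefficients by least squares, and keep
    the seed with the smallest reconstruction error.
    """
    best = None
    for seed in range(num_seeds):
        # Same seed and shape always yield the same basis, so only the seed
        # (not the basis) has to be stored.
        basis = np.random.default_rng(seed).standard_normal((block.size, rank))
        coeffs, *_ = np.linalg.lstsq(basis, block, rcond=None)
        err = np.linalg.norm(basis @ coeffs - block)
        if best is None or err < best[0]:
            best = (err, seed, coeffs)
    return best[1], best[2]


def decompress_block(seed, coeffs, size):
    """Regenerate the basis from `seed` and rebuild the approximate block."""
    basis = np.random.default_rng(seed).standard_normal((size, coeffs.size))
    return basis @ coeffs


if __name__ == "__main__":
    rng = np.random.default_rng(12345)
    block = rng.standard_normal(8)            # stand-in for 8 weights
    seed, coeffs = compress_block(block)
    approx = decompress_block(seed, coeffs, block.size)
    print("seed:", seed)
    print("relative error:", np.linalg.norm(approx - block) / np.linalg.norm(block))
```

The reconstruction is lossy, and the search over seeds is where compute is spent at compression time. At inference time only the basis regeneration and one small matrix-vector product are needed per block, which is the compute-for-memory trade the summary refers to.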