PulseAugur / Brief

Brief

last 24h
[1/1] 223 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

  1. QAT MTP Heads Upload + PARALLEL=2 Fix + 12B 2-slot Bench

    The Gemma 4 QAT MTP assistant heads have been released on Hugging Face for use in speculative decoding. The heads are trained to match the quantized (QAT) weights of the Gemma 4 models, which is reported to significantly increase draft acceptance rates compared with heads that are not matched to the QAT weights. Separately, a critical crash in llama.cpp when running with PARALLEL=2 (two parallel slots) has been identified and fixed, improving stability for local LLM inference. The update also includes a 2-slot benchmark of the 12B model.

    IMPACT Enables more efficient local inference for Gemma 4 models: QAT-matched MTP heads raise speculative-decoding acceptance rates, and the PARALLEL=2 crash fix makes two-slot llama.cpp runs stable.
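
    To illustrate why a higher acceptance rate matters for speculative decoding, here is a minimal sketch. It assumes every drafted token is accepted independently with the same probability, which is a simplification, and the function name and the rates used are hypothetical, not measurements from this release. Under that assumption, the expected number of tokens produced per verification pass of the main model is 1 + a + a^2 + ... + a^n for acceptance rate a and draft length n.

```python
def expected_tokens_per_step(acceptance_rate: float, draft_len: int) -> float:
    """Expected tokens emitted per verification pass of the target model.

    Simplifying assumption: each drafted token is accepted independently
    with probability `acceptance_rate`. The pass always yields at least one
    token, plus one more for each consecutive accepted draft token.
    """
    if not 0.0 <= acceptance_rate <= 1.0:
        raise ValueError("acceptance_rate must be in [0, 1]")
    if draft_len < 0:
        raise ValueError("draft_len must be non-negative")
    # Geometric series: 1 + a + a^2 + ... + a^draft_len
    return sum(acceptance_rate ** k for k in range(draft_len + 1))


if __name__ == "__main__":
    # Illustrative rates only; these are NOT benchmark numbers from the release.
    for rate in (0.5, 0.8):
        tokens = expected_tokens_per_step(rate, draft_len=4)
        print(f"acceptance={rate:.1f} -> {tokens:.4f} tokens per pass")
```

    Under these assumptions, the tokens gained per pass grow faster than linearly as the acceptance rate rises, which is why heads matched to the quantized weights can pay off even if the draft length stays the same.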