PulseAugur / Brief
EN
LIVE 23:37:35

Brief

last 24h
[1/1] 223 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

  1. Chiaroscuro Attention: Spending Compute in the Dark

    Researchers have developed CHIAR-Former, a novel 4-layer transformer model that optimizes compute usage by dynamically routing tokens. Instead of applying self-attention uniformly, CHIAR-Former analyzes token spectral entropy to direct each token to one of three operators: DCT spectral mixing, RBF kernel mixing, or full self-attention. This approach significantly improves performance on large-scale naturalistic text, achieving a 45% perplexity improvement on WikiText-103 with 62.5% fewer attention FLOPs compared to a standard transformer. AI

    IMPACT Introduces a method to significantly reduce computational cost for transformers on large text datasets.