PulseAugur / Brief
EN
LIVE 11:27:45

Brief

last 24h
[1/1] 224 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

  1. Open sourcing InfiniteKV: a KV cache that files old tokens as 104-byte searchable records in RAM or on disk instead of deleting them. Mistral-7B answered from token 76,747, 2.3x past its trained window. Colab demo

    InfiniteKV is a new KV cache system designed to extend the context window of large language models by storing older tokens in a compressed, searchable format on disk or in RAM. This approach allows models to access information far beyond their original training limits, as demonstrated by Mistral-7B successfully answering a query from token 76,747, significantly past its 32,768 token limit. The system maintains recent tokens in GPU memory for speed while offloading older ones, drastically reducing memory requirements from gigabytes per million tokens to just a few megabytes. AI

    IMPACT Enables LLMs to process and recall information from vastly extended contexts, potentially unlocking new applications in long-form content analysis and generation.