LRU
PulseAugur coverage of LRU — every cluster mentioning LRU across labs, papers, and developer communities, ranked by signal.
2 days with sentiment data
-
KV cache eviction protection proves more vital than scoring
Researchers have developed a new method for managing KV cache eviction in large language models, finding that structural protection is more critical than scoring algorithms. Their study on transformer models revealed th…
-
Looped SSMs improve time series classification with depth-recurrence
Researchers have introduced Looped SSMs, a novel approach to State Space Models for time series classification. This method enhances performance by applying depth-recurrence, where model blocks are reused across layers,…
-
Apple researchers unveil SpecMD for faster MoE model inference
Apple's machine learning research team has published a paper detailing SpecMD, a new framework for evaluating Mixture-of-Experts (MoE) model caching policies. Their experiments show that traditional caching assumptions …
-
New ML-based GPU caching algorithm LCR boosts LLM inference speed
Researchers have developed a new GPU caching algorithm called Learning-Augmented LRU (LALRU) designed to improve efficiency during AI inference. This algorithm integrates learned predictions with caching policies to ens…
-
Memristor-based AI systems show promise for efficient learning and neuromorphic computing
Researchers are exploring Self-Organising Memristive Networks (SOMNs) as a physical alternative to conventional hardware for artificial intelligence, aiming for energy-efficient, brain-like continual learning. These net…