ENTITY
KV cache quantization
PulseAugur coverage of KV cache quantization: every cluster that mentions it across labs, papers, and developer communities, ranked by signal.
Total · 30d: 2 (2 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 0 (0 over 90d)
SENTIMENT · 30D: 2 days with sentiment data
RECENT · PAGE 1/1 · 2 TOTAL
- Local LLM users report JSON errors with large context
  Users on the r/LocalLLaMA subreddit are encountering JSON parsing errors, specifically "syntax error while parsing value - invalid string: missing closing quote; last read." This issue appears to be linked to the contex…
- Together AI open-sources OSCAR for efficient LLM serving
  Together AI has open-sourced OSCAR, a new system for 2-bit KV cache quantization. This technique aims to improve the efficiency of serving large language models, particularly those with long context windows. The develop…