LongBench: a bilingual, multitask benchmark for long context understanding
PulseAugur coverage of LongBench: every cluster mentioning the benchmark across labs, papers, and developer communities, ranked by signal.
Sentiment data available for 2 days
-
KV cache eviction protection proves more vital than scoring
Researchers have developed a new method for managing KV cache eviction in large language models, finding that structural protection is more critical than scoring algorithms. Their study on transformer models revealed th…
-
EndPrompt method efficiently extends LLM context windows with sparse supervision
Researchers have developed EndPrompt, a novel method to efficiently extend the context window of large language models without requiring extensive training on long sequences. By appending a brief terminal prompt with hi…
-
Google's TurboQuant cuts LLM memory use by 6x with no accuracy loss
Google researchers have developed a new technique called TurboQuant that significantly reduces the memory required by large language models. By employing a two-step process involving data rotation and scalar quantizatio…
-
New paper proposes residual-mass accounting for partial-KV decoding
Researchers have developed a novel method for partial-KV decoding, which improves the efficiency of large language models by computing exact softmax contributions for only a subset of tokens. This approach uses learned…
-
New research explores LLM security, efficiency, and training optimization
Researchers are developing novel methods to enhance the efficiency and security of Large Language Models (LLMs). One approach, "Widening the Gap," exploits outlier injection to compromise LLM quantization, demonstrating…