ENTITY LongBench: a bilingual, multitask benchmark for long context understanding

PulseAugur coverage of LongBench, a bilingual, multitask benchmark for long context understanding: every cluster that mentions it across labs, papers, and developer communities, ranked by signal.

Total · 30d

20 over 90d

Releases · 30d

0 over 90d

Papers · 30d

17 over 90d

TIER MIX · 90D

research 10
tool 8
commentary 2

TOPICS

paper 17
infra 11
model release 10
safety 2
product 1
other 1

RELATIONSHIPS

used by SnapKV 60%
other SnapKV 50%

SENTIMENT · 30D

8 days with sentiment data

RECENT · PAGE 1/1 · 20 TOTAL

COMMENTARY · CL_163194 · Jul 25 · 15:45

OpenAI subreddit user seeks chat evaluation benchmarks

A user on the r/OpenAI subreddit is seeking recommendations for datasets and benchmarks to evaluate chat model performance. They are specifically interested in measuring multi-turn accuracy and memory management, noting…

RESEARCH · CL_154057 · Jul 21 · 04:00

LLM KV Cache Management: Security Threats and Efficiency Innovations

Researchers are exploring new methods to manage the Key-Value (KV) cache in large language models, which is crucial for inference speed but grows linearly with context length. One approach, "Error Certificates for KV-Ca…

RESEARCH · CL_151862 · Jul 20 · 04:00

New research tackles LLM inference efficiency with novel caching and compression techniques · 5 sources tracked

Several research papers introduce novel techniques to enhance the efficiency of large language model (LLM) inference. SonicSampler offers unified, tile-aware kernels for LLM sampling and speculative verification, achiev…

RESEARCH · CL_131290 · Jul 7 · 11:35

New framework LongCrafter enhances LLM long-context understanding

Researchers have introduced LongCrafter, a novel framework designed to generate diverse and high-quality data for fine-tuning large language models (LLMs) to improve their long-context understanding. This framework addr…

TOOL · CL_123006 · Jul 3 · 04:00

New RAGP method compresses prompts using graph pruning and Lévy walks

Researchers have developed a novel prompt compression technique called RAGP, which models text as a multiplex graph to capture both local syntactic and global semantic relationships. This approach utilizes Lévy walks to…

TOOL · CL_115682 · Jun 29 · 04:00

New RL Framework Optimizes LLM KV Cache for Efficient Inference

Researchers have developed a novel framework called KV Policy (KVP) to address the memory demands of large language models (LLMs) by optimizing the Key-Value (KV) cache. KVP reframes KV cache eviction as a reinforcement…

COMMENTARY · CL_113515 · Jun 27 · 12:23

1M context window is capacity, not capability for LLMs

While large language models now support context windows of up to one million tokens, this capacity does not equate to perfect memory or reasoning. Researchers highlight that models often struggle with information in the…

TOOL · CL_111684 · Jun 26 · 04:00

New SSM adapters outperform LoRA for long-context fine-tuning

Researchers have developed a new parameter-efficient fine-tuning (PEFT) method called Hankel Reduced order Model (HRM) adapters, which utilize state space models (SSMs) for long-context fine-tuning. Unlike traditional P…

RESEARCH · CL_115713 · Jun 25 · 16:16

New attention mechanisms boost LLM efficiency and reduce hallucination · 10 sources tracked

Researchers are developing novel attention mechanisms to improve the efficiency and capabilities of large language models (LLMs) and multimodal large language models (MLLMs). These advancements focus on optimizing spars…

RESEARCH · CL_107863 · Jun 22 · 21:42

Nexus Sampling improves LLM KV cache eviction, reducing memory use

Researchers have developed Nexus Sampling, a novel method for managing KV cache eviction in large language models, particularly for long-context and agentic workloads. This training-free approach pairs Nexus scoring wit…

RESEARCH · CL_106564 · Jun 21 · 08:48

New KV Cache Compression Techniques Boost LLM Inference Performance · 9 sources tracked

Multiple research papers explore novel techniques for optimizing the Key-Value (KV) cache in large language model (LLM) serving to address memory and performance bottlenecks. These methods, including quantization, pruni…

RESEARCH · CL_93251 · Jun 15 · 00:00

New LLM KV Cache Compression Methods Tackle Safety and Efficiency

Researchers are developing new methods to compress the Key-Value (KV) cache in large language models (LLMs) to reduce memory usage and improve inference efficiency. AnchorKV focuses on safety by biasing token retention …

RESEARCH · CL_76817 · Jun 5 · 04:49

New EASE-TTT framework boosts long-context QA for smaller LLMs

Researchers have developed EASE-TTT, a novel framework for improving long-context question answering in smaller language models. This method aligns retrieved evidence chunks with attention mechanisms to guide model adap…

TOOL · CL_68443 · Jun 3 · 04:00

EndPrompt method efficiently extends LLM context windows

Researchers have developed a new method called EndPrompt to efficiently extend the context window of large language models without requiring extensive training on long sequences. This technique involves training with a …

TOOL · CL_38307 · May 18 · 08:41

KV cache eviction protection proves more vital than scoring

Researchers have developed a new method for managing KV cache eviction in large language models, finding that structural protection is more critical than scoring algorithms. Their study on transformer models revealed th…

TOOL · CL_32702 · May 14 · 09:00

EndPrompt method efficiently extends LLM context windows with sparse supervision

Researchers have developed EndPrompt, a novel method to efficiently extend the context window of large language models without requiring extensive training on long sequences. By appending a brief terminal prompt with hi…

TOOL · CL_24313 · May 9 · 16:31

Google's TurboQuant cuts LLM memory use by 6x with no accuracy loss

Google researchers have developed a new technique called TurboQuant that significantly reduces the memory required by large language models. By employing a two-step process involving data rotation and scalar quantizatio…

TOOL · CL_22116 · May 8 · 04:00

New paper proposes residual-mass accounting for partial-KV decoding

Researchers have developed a novel method for partial-KV decoding, which optimizes the efficiency of large language models by only computing exact softmax contributions for a subset of tokens. This approach uses learned…

RESEARCH · CL_14463 · Apr 27 · 04:00

New research explores LLM security, efficiency, and training optimization

Researchers are developing novel methods to enhance the efficiency and security of Large Language Models (LLMs). One approach, "Widening the Gap," exploits outlier injection to compromise LLM quantization, demonstrating…

RESEARCH · CL_39746 · Mar 4 · 00:00

New methods tackle LLM KV cache compression for long contexts

Multiple research papers released in May and June 2026 propose novel methods for compressing the Key-Value (KV) cache in large language models (LLMs). These techniques aim to reduce the significant memory overhead assoc…
