ruler
PulseAugur coverage of ruler — every cluster mentioning ruler across labs, papers, and developer communities, ranked by signal.
-
LLM context compaction quality degradation curve observed, lacks benchmarks
A user observed that the output quality of LLMs like DeepSeek V4 and Claude Code does not degrade linearly with repeated context compaction. Instead, there appears to be a temporary improvement after the second compacti…
-
Cursor AI coding tool usage slashed with custom rules and skills
The author details how they significantly reduced their daily usage of the AI coding tool Cursor by implementing custom rules and skills. This change was prompted by a realization that their previous approach was inefficien…
-
New framework unifies sequence models using Bayesian memory
Researchers have introduced a "design-model" framework for creating efficient recurrent sequence maps based on memory assumptions. This framework uses Bayesian filtering to write evidence into memory and a query-depende…
-
New metric reveals LLM unlearning methods fail to fully forget sensitive data
A new research paper introduces "Leak@k", a metric designed to evaluate the effectiveness of unlearning methods in large language models (LLMs). The study found that most current unlearning techniques fail to complete…
-
New RULER metrics detect residual data in machine unlearning
Researchers have developed RULER, a new set of metrics designed to verify machine unlearning at the representation level. Current methods only check output-level compliance, which can still leave residual information in…
-
New RW-TTT method boosts LLM test-time training efficiency
Researchers have developed a new method called RW-TTT to improve the efficiency of test-time training (TTT) for large language models. TTT allows models to adapt during generation by updating request-specific states, bu…
-
New EXACT method boosts LLM long-context understanding
Researchers have developed a new supervision objective called EXACT to improve long-context adaptation in language models. This method addresses a mismatch in packed training by assigning extra weight to targets that re…
-
FocuSFT improves LLM long-context understanding via bilevel optimization
Researchers have developed FocuSFT, a novel bilevel optimization framework designed to improve how large language models handle long contexts. This method addresses the issue of "attention dilution," where models tend t…
-
New paper proposes residual-mass accounting for partial-KV decoding
Researchers have developed a novel method for partial-KV decoding, which optimizes the efficiency of large language models by only computing exact softmax contributions for a subset of tokens. This approach uses learned…
-
Subquadratic debuts 12M-token context window with linear scaling architecture
Subquadratic, a startup with 11 PhD researchers, has launched a new model featuring its Subquadratic Selective Attention (SSA) architecture, which claims to scale linearly with context length. This innovation allows for…
-
Q-RAG method enables efficient multi-step retrieval for LLMs up to 10M tokens
Researchers have introduced Q-RAG, a novel method for enhancing Retrieval-Augmented Generation (RAG) systems. This approach utilizes reinforcement learning to fine-tune the embedder model for multi-step retrieval, a mor…
-
Understanding and Improving Length Generalization in Hierarchical Sparse Attention Models
Researchers have identified three key design principles crucial for enhancing length generalization in hierarchical sparse attention models. These principles include using an expressive Chunk Encoder with a CLS token fo…
-
SIEVES method boosts multimodal LLM coverage on visual tasks with evidence scoring
Researchers have developed SIEVES, a novel method for improving the reliability of multimodal large language models (MLLMs) in out-of-distribution scenarios. SIEVES works by learning to estimate the quality of visual ev…
-
New methods tackle LLM KV cache compression for long contexts
Multiple research papers released in May and June 2026 propose novel methods for compressing the Key-Value (KV) cache in large language models (LLMs). These techniques aim to reduce the significant memory overhead assoc…