ENTITY ruler

ruler

PulseAugur coverage of ruler — every cluster mentioning ruler across labs, papers, and developer communities, ranked by signal.

Total · 30d

6 over 90d

Releases · 30d

0 over 90d

Papers · 30d

6 over 90d

TIER MIX · 90D

RELATIONSHIPS

used by BABILong 70%

SENTIMENT · 30D

1 day(s) with sentiment data

RECENT · PAGE 1/1 · 7 TOTAL

TOOL · CL_28323 · May 11 · 13:23

New EXACT method boosts LLM long-context understanding

Researchers have developed a new supervision objective called EXACT to improve long-context adaptation in language models. This method addresses a mismatch in packed training by assigning extra weight to targets that re…

TOOL · CL_27567 · May 11 · 03:30

FocuSFT improves LLM long-context understanding via bilevel optimization

Researchers have developed FocuSFT, a novel bilevel optimization framework designed to improve how large language models handle long contexts. This method addresses the issue of "attention dilution," where models tend t…

TOOL · CL_22116 · May 8 · 04:00

New paper proposes residual-mass accounting for partial-KV decoding

Researchers have developed a novel method for partial-KV decoding, which optimizes the efficiency of large language models by only computing exact softmax contributions for a subset of tokens. This approach uses learned…

TOOL · CL_19355 · May 6 · 12:15

Subquadratic debuts 12M-token context window with linear scaling architecture

Subquadratic, a startup with 11 PhD researchers, has launched a new model built on its Subquadratic Selective Attention (SSA) architecture, which the company claims scales linearly with context length. This innovation allows for…

TOOL · CL_16230 · May 5 · 04:00

Q-RAG method enables efficient multi-step retrieval for LLMs up to 10M tokens

Researchers have introduced Q-RAG, a novel method for enhancing Retrieval-Augmented Generation (RAG) systems. This approach utilizes reinforcement learning to fine-tune the embedder model for multi-step retrieval, a mor…

RESEARCH · CL_11786 · May 1 · 04:00

Understanding and Improving Length Generalization in Hierarchical Sparse Attention Models

Researchers have identified three key design principles crucial for enhancing length generalization in hierarchical sparse attention models. These principles include using an expressive Chunk Encoder with a CLS token fo…

RESEARCH · CL_08517 · Apr 28 · 16:57

SIEVES method boosts multimodal LLM coverage on visual tasks with evidence scoring

Researchers have developed SIEVES, a novel method for improving the reliability of multimodal large language models (MLLMs) in out-of-distribution scenarios. SIEVES works by learning to estimate the quality of visual ev…
