ENTITY Language Modeling

Language Modeling

PulseAugur coverage of Language Modeling — every cluster mentioning Language Modeling across labs, papers, and developer communities, ranked by signal.

Total · 30d

4 over 90d

Releases · 30d

0 over 90d

Papers · 30d

4 over 90d

SENTIMENT · 30D

3 day(s) with sentiment data

RECENT · PAGE 1/1 · 4 TOTAL

RESEARCH · CL_115231 · Jun 26 · 06:08

Flexformer introduces learnable attention kernels for efficient Transformers

Researchers have introduced Flexformer, a novel linear Transformer architecture designed to overcome the quadratic complexity limitations of traditional Transformers. Flexformer achieves this by learning attention kerne…

RESEARCH · CL_95870 · Jun 16 · 05:02

Researchers analyze transformer expressivity using formal grammars

A new research paper analyzes the expressivity of deep transformer models by examining their ability to represent hierarchical structures. The study uses bounded-depth, non-recursive context-free grammars to construct t…

RESEARCH · CL_70263 · Jun 4 · 04:00

Transformer study finds QKV projection sharing slashes memory use

Researchers have investigated the necessity of three distinct projections (query, key, and value) in Transformer models. Their study found that sharing projections, particularly the Q-K=V variant, can significantly redu…

RESEARCH · CL_06711 · Apr 28 · 04:00

Switch Attention dynamically routes between full and sliding window attention

Researchers have introduced Switch Attention (SwiAttn), a novel hybrid transformer architecture designed to address the computational bottleneck of standard full attention mechanisms in long-context language modeling. S…
