Penn Treebank
PulseAugur coverage of Penn Treebank: every cluster mentioning Penn Treebank across labs, papers, and developer communities, ranked by signal.
-
Kan Extension Transformers unify attention, diffusion, and self-conditioning
Researchers have introduced Kan Extension Transformers (KETs), a new framework that unifies various Transformer implementations under a categorical lens. KETs view Transformer layers as weighted structured extension ope…
-
Energy-Gated Attention enhances Transformer models by prioritizing salient tokens
Researchers have introduced Energy-Gated Attention (EGA), a novel mechanism designed to improve Transformer models by focusing on spectrally salient tokens. This approach mimics principles from fluid dynamics, prioritiz…
-
Researchers explore weight decay, in-context learning, and acceleration for Transformer models
Researchers have developed several new methods to improve the efficiency and theoretical understanding of Transformer models. One paper provides a functional-analytic characterization of weight decay, demonstrating its …