WikiText-103

PulseAugur coverage of WikiText-103 — every cluster mentioning WikiText-103 across labs, papers, and developer communities, ranked by signal.

Total · 30d · 6 over 90d

Releases · 30d · 0 over 90d

Papers · 30d · 6 over 90d

RECENT · PAGE 1/1 · 6 TOTAL

RESEARCH · CL_21794 · May 7 · 15:23

New parameter E predicts Mixture-of-Experts model health, preventing dead experts.

Researchers have introduced a new dimensionless control parameter, E = T*H/(O+B), to predict the health of expert ecologies in Mixture-of-Experts (MoE) models. This parameter, derived from four hyperparameters, can prev…

RESEARCH · CL_20402 · May 7 · 04:00

Jordan-RoPE: Non-Semisimple Relative Positional Encoding via Complex Jordan Blocks

Researchers have introduced Jordan-RoPE, a novel relative positional encoding method for transformer models that utilizes complex Jordan blocks. This approach generates oscillatory-polynomial features, enabling a distan…

TOOL · CL_18622 · May 6 · 04:00

New framework uses masked language models for efficient wireless token communication

Researchers have developed a novel context-aware wireless token communication framework that utilizes a masked language model (MLM) to improve transmission efficiency. This system enables robust token inference over noi…

RESEARCH · CL_15913 · May 5 · 04:00

Researchers explore weight decay, in-context learning, and acceleration for Transformer models

Researchers have developed several new methods to improve the efficiency and theoretical understanding of Transformer models. One paper provides a functional-analytic characterization of weight decay, demonstrating its …

RESEARCH · CL_08625 · Apr 29 · 04:00

Phase-Associative Memory: Sequence Modeling in Complex Hilbert Space

Researchers have introduced a novel complex-valued sequence model called Phase-Associative Memory (PAM) that utilizes a Hilbert space formalism to better capture the indeterminate nature of meaning in semantic expressions. …

RESEARCH · CL_06744 · Apr 28 · 04:00

AutoCompress method isolates critical transformer layers for efficient compression

Researchers have developed AutoCompress, a novel method for compressing transformer models by isolating and preserving the critical first layer (Layer 0). This approach, termed Critical Layer Isolation (CLI), showed tha…
