ENTITY The Pile

PulseAugur coverage of The Pile: every cluster mentioning The Pile across labs, papers, and developer communities, ranked by signal.

Total · 30d: 4 (4 over 90d)

Releases · 30d: 0 (0 over 90d)

Papers · 30d: 4 (4 over 90d)

SENTIMENT · 30D

2 days with sentiment data

RECENT · PAGE 1/1 · 4 TOTAL

RESEARCH · CL_106759 · Jun 17 · 00:00

New LLM Training Methods Optimize Data Scheduling for Efficiency and Performance

Researchers have developed new methods for optimizing the training of large language models (LLMs) through advanced data scheduling techniques. One approach, the Holistic Data Scheduler (HDS), uses multi-objective reinf…

RESEARCH · CL_65623 · Jun 1 · 15:26

Researchers track attention circuit formation in 1B-class language models

A new research paper investigates the emergence of attention circuits in language models, specifically tracking how different types of attention heads form across various model architectures and training datasets. The s…

RESEARCH · CL_16916 · May 5 · 17:37

New VPD method decomposes language model parameters, improving interpretability

Researchers have introduced adVersarial Parameter Decomposition (VPD), an improved method for interpreting language model parameters. This new technique builds upon previous work like Stochastic Parameter Decomposition …

RESEARCH · CL_00875 · May 15 · 00:00

RWKV project revives RNNs to challenge Transformer dominance in LLMs

The RWKV (Receptance Weighted Key Value) project introduces a novel architecture that revives Recurrent Neural Networks (RNNs) while incorporating advantages typically found in Transformers. This approach aims to overco…