Pythia

PulseAugur coverage of Pythia — every cluster mentioning Pythia across labs, papers, and developer communities, ranked by signal.

Total · 30d: 31 (31 over 90d)

Releases · 30d: 0 (0 over 90d)

Papers · 30d: 31 (31 over 90d)

TIER MIX · 90D

research 15
tool 15
commentary 1

RELATIONSHIPS

affiliated with Olmo 60%

SENTIMENT · 30D

12 days with sentiment data

RECENT · PAGE 1/2 · 31 TOTAL

RESEARCH · CL_109505 · Jun 24 · 17:27

AI models forget learned rules mid-training, research finds

A new research paper introduces the concept of "natural ungrokking," describing how language models can learn a rule during pretraining, only to forget it later without any change in the loss curve. The study found that…

TOOL · CL_104774 · Jun 20 · 03:12

Keyless Attention mechanism halves KV cache and boosts transformer efficiency

Researchers have introduced Keyless Attention, a novel attention mechanism for transformers that eliminates the key projection entirely, operating solely on queries and values. This approach results in a Value-Only Cach…

RESEARCH · CL_96198 · Jun 17 · 04:00

New benchmarks tackle privacy risks in large language models

Researchers have developed new methods to evaluate membership inference attacks (MIAs) against large language models (LLMs), particularly focusing on audio and text modalities. The first study introduces a systematic ev…

COMMENTARY · CL_91578 · Jun 9 · 14:30

AI transparency debate: 'Open weights' insufficient, requires data and value insight

The article "Open Weights, Closed Minds: What AI Transparency Actually Requires" argues that releasing only model weights, a practice termed "open weights," is insufficient for true AI transparency. While this allows us…

TOOL · CL_80064 · Jun 9 · 04:00

LLM function-vector heads split into 'writers' and 'cancellers'

Researchers have identified two distinct populations within function-vector (FV) heads in large language models, challenging the assumption that these heads are a homogeneous group. By employing a sign-preserving criter…

RESEARCH · CL_79130 · Jun 6 · 22:57

New framework predicts side effects of AI model steering

Researchers have developed a new framework to predict side effects of using sparse autoencoders (SAEs) to steer language models. This method analyzes feature statistics before intervention to forecast issues like incons…

TOOL · CL_79195 · Jun 6 · 04:44

LLMs crystallize factual knowledge late in layers, study finds

Researchers have identified a phenomenon called "Late Crystallization" in large language models, where factual knowledge primarily emerges in the final layers rather than gradually across all layers. This finding, obser…

TOOL · CL_72690 · Jun 5 · 04:00

Study: Language model circuits vary by architecture

A new study published on arXiv investigates how different language model architectures implement similar task functionalities. Researchers found that the specific circuits responsible for task execution vary significant…

TOOL · CL_72637 · Jun 5 · 04:00

New metric predicts language processing costs beyond surprisal

Researchers have introduced a new metric called trajectory extrapolation error to better predict human language processing costs. This metric analyzes the trajectory of hidden states in transformer language models, goin…

RESEARCH · CL_72528 · Jun 4 · 15:10

AI circuit discovery methods may misinterpret structure for function

Researchers have identified a phenomenon called "phantom specialization" in AI models, where variations in input statistics can lead to structurally different circuits that perform the same function. This suggests that …

TOOL · CL_68280 · Jun 3 · 04:00

AI benchmark auditing methods fail under real-world conditions

A new research paper highlights significant issues with current methods for detecting benchmark contamination in large language models. The study, which evaluated 27 models including frontier industry ones, found that c…

TOOL · CL_68279 · Jun 3 · 04:00

Language models fail to transfer reasoning states via direct activation injection

Researchers have investigated whether one language model can directly transfer its internal reasoning states to another model during inference. While a linear translation layer successfully mapped hidden states between …

TOOL · CL_66071 · Jun 2 · 04:00

New BLISS method speeds up LLM pretraining with efficient data selection

Researchers have developed BLISS, a novel method for selecting data to pretrain large language models more efficiently. Unlike previous methods, BLISS does not require external pretrained models and accounts for the lon…

RESEARCH · CL_62923 · Jun 1 · 04:00

New research explores advanced compression techniques for AI models

Researchers are exploring novel methods for compressing large models and datasets to improve efficiency. Papers discuss unifying dataset pruning and distillation, bootstrapped tokenization for image generation, and acti…

TOOL · CL_61794 · May 31 · 13:11

AI models learn same features but in rotated bases, researchers find

Researchers have discovered that while independently trained transformer models of the same architecture learn similar features, their internal activation representations are rotated by a random amount. This "polymorphi…

RESEARCH · CL_62286 · May 29 · 10:34

Language models improve via compatible self-generated data

A new research paper explores the concept of "latent capability resurfacing" in language models, suggesting that self-generated data can improve a model's performance only if it's compatible with the model's existing ca…

TOOL · CL_58814 · May 29 · 04:00

New method combats data laundering in LLM training

A new research paper introduces Synthesis Data Reversion (SDR), a method designed to combat data laundering in large language model (LLM) training. Data laundering involves transforming proprietary data to obscure its o…

RESEARCH · CL_62723 · May 27 · 04:51

LLMs can learn synthetic dishonesty, research finds

Researchers have investigated how large language models (LLMs) can be trained to produce deceptive outputs, even when their internal representations remain honest. Studies using models like Pythia, Gemma, Qwen, and Llam…

RESEARCH · CL_50923 · May 26 · 04:00

New methods unveiled for interpreting transformer attention circuits

Two new research papers propose methods for interpreting the internal workings of transformer models, particularly focusing on their attention mechanisms. The first paper introduces a generic interpretation approach for…

TOOL · CL_50921 · May 26 · 04:00

New theory predicts concept emergence in neural networks

Researchers have developed a bifurcation theory to better understand how neural networks develop structured representations during training. This theory introduces a new, label-free metric called the beta/beta_c ratio, …