PulseAugur
Sparse Autoencoders

PulseAugur coverage of Sparse Autoencoders — every cluster mentioning Sparse Autoencoders across labs, papers, and developer communities, ranked by signal.

Total · 30d: 47 (47 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 47 (47 over 90d)
TIMELINE
  1. 2026-05-25 research_milestone Researchers published a paper detailing a new method for multilingual language steering in LLMs using sparse autoencoders.
  2. 2026-05-21 research_milestone Researchers published a paper detailing a new method for multilingual steering in LLMs using sparse autoencoders.
SENTIMENT · 30D

13 days with sentiment data

RECENT · PAGE 1/3 · 47 TOTAL
  1. RESEARCH · CL_111220

    LLMs improved for forecasting via feature steering

    Researchers have developed a method to improve the generalization capabilities of Large Language Models (LLMs) in forecasting tasks. By analyzing LLM internal states with sparse autoencoders, they identified features re…

  2. RESEARCH · CL_107742

    New research explores sparse autoencoders for AI interpretability and generalization

    Researchers are exploring sparse autoencoders (SAEs) for interpreting complex language and vision models. One paper introduces Qwen3-Instruct SAEs for various Qwen3 model sizes, demonstrating their use in steering model…

  3. RESEARCH · CL_106825

    New research probes interpretability of AI location encoders · 2 sources tracked

    Two new research papers explore the interpretability and spatial effect capture of location encoders used in machine learning. The first paper analyzes geographic implicit neural representations, decomposing location em…

  4. TOOL · CL_105118

    Chemical language models' internal representations analyzed with sparse autoencoders

    A new research paper explores the internal workings of chemical language models (cLMs) by applying sparse autoencoders (SAEs) to MolFormer. The study reveals that early layers of the model focus on syntactic patterns an…

  5. RESEARCH · CL_98104

    New framework certifies interpretability of Sparse Autoencoders in language models

    Researchers have developed a new framework to certify the interpretability of Sparse Autoencoders (SAEs) when used with language models. This framework establishes an upper bound on the risk of a language model by using…

  6. RESEARCH · CL_95864

    New research tackles VLM hallucinations, distillation, and interpretability

    Researchers are developing new methods to improve the capabilities and reliability of vision-language models (VLMs). One approach, DCLA, focuses on mitigating hallucinations by ensuring consistency across different laye…

  7. TOOL · CL_93358

    New CSAE Method Unlocks Hierarchical Visual Concepts in LLMs

    Researchers have developed cascaded sparse autoencoders (CSAEs) to better interpret the visual representations within multimodal large language models (MLLMs). Unlike previous methods that produced flat feature dictiona…

  8. RESEARCH · CL_98012

    AI model interventions unreliable, new research finds

    A new research paper demonstrates that interventions designed to suppress undesirable behaviors in AI models by manipulating Sparse Autoencoder (SAE) features are unreliable. The study shows that even when specific SAE …

  9. TOOL · CL_91333

    New AI method audits protein models for hazardous designs

    Researchers have developed VFUSE, a novel approach using Sparse Autoencoders (SAEs) to interpret generative protein models like RoseTTAFold3 and RFDiffusion3. This method aims to identify and understand features associa…

  10. TOOL · CL_91442

    New method improves neural network interpretability by addressing dense activations

    Researchers have proposed a new method to improve the interpretability of neural networks by questioning the assumption that all activation content can be sparsely decomposed. They hypothesize that activations contain a…

  11. RESEARCH · CL_84409

    Unstable sparse autoencoder features form reproducible subspaces

    Researchers have investigated the reproducibility of features learned by sparse autoencoders (SAEs), a common tool for interpreting neural network representations. Their study reveals that while individual features can …

  12. RESEARCH · CL_91462

    New research enhances sparse autoencoder interpretability and robustness

    Researchers are exploring new methods to improve the interpretability and robustness of sparse autoencoders (SAEs). One approach, GRILL, aims to reveal hidden vulnerabilities in autoencoders by restoring degraded gradie…

  13. TOOL · CL_79921

    AI concept learning unified by geometric framework

    Researchers have developed a geometric framework that unifies supervised and unsupervised concept learning in AI models. This approach views both Concept Bottleneck Models (CBMs) and Sparse Autoencoders (SAEs) as learni…

  14. TOOL · CL_77263

    New ViSAE toolbox interprets and steers Vision Transformer models

    Researchers have developed ViSAE, a new toolbox designed to interpret and steer the behavior of Vision Transformers (ViTs). Inspired by neuroscience, ViSAE uses sparse autoencoders to decompose ViT representations into …

  15. RESEARCH · CL_79165

    New framework enhances LLM interpretability with self-correcting explanations

    Researchers have introduced SAEExplainer, a new framework designed to improve the interpretability of Sparse Autoencoders (SAEs) within large language models. This method uses activation scores as a reward signal to ena…

  16. TOOL · CL_68436

    New metric measures LLM ideological depth and refusal causes

    Researchers have introduced a new metric called "ideological depth" to measure the internal political representations within large language models. This metric assesses a model's ability to follow political instructions…

  17. RESEARCH · CL_68434

    LLM research probes in-context learning mechanisms

    Two new research papers explore the mechanisms behind in-context learning in large language models. One paper investigates whether transformer activations can be used to optimize in-context sample selection, finding tha…

  18. TOOL · CL_65820

    Sparse autoencoders enable interpretable emotion control in TTS

    Researchers have developed a new method for controlling emotions in text-to-speech (TTS) systems by utilizing sparse autoencoders (SAEs) to identify and manipulate latent features within large language models. This appr…

  19. RESEARCH · CL_66057

    New theory explains how Sparse Autoencoders structure interpretable representations

    A new research paper explores the theoretical underpinnings of Sparse Autoencoders (SAEs), a technique used to interpret complex neural network representations. The study proposes a framework to understand what SAEs ext…

  20. RESEARCH · CL_65976

    Research questions stability of Archetypal SAEs for concept extraction

    A new research paper challenges the stability claims of Archetypal Sparse Autoencoders (SAEs), a method designed for more reliable concept extraction in neural networks. The study demonstrates that the reported stabilit…