Gemma 2-2B-it
PulseAugur coverage of Gemma 2-2B-it: every cluster that mentions the model across labs, papers, and developer communities, ranked by signal.
2 days with sentiment data
-
Language model interpretability: Detection and steering are not aligned
Researchers have investigated the relationship between knowing a behavior's representation in a language model and the ability to steer that behavior. They found that the direction used to detect a behavior, such as hal…
-
New Diffusion Transformers Advance Image Generation and Transmission
Researchers are developing new diffusion transformer models for advanced image generation and transmission. One approach, DDM-SSCC, adapts diffusion language models for lossless pixel-level image transmission, outperfor…
-
New Audit Method Reveals Inconsistent AI Model Refusals to Hazardous Content
A new research paper introduces BioRefusalAudit, a method to evaluate the robustness of AI model refusals to hazardous content. The study found that many models' refusals are inconsistent, collapsing under minor prompt …
-
New research explores efficient and robust machine unlearning techniques
Researchers are developing new methods for machine unlearning, which aims to remove specific data's influence from trained models without full retraining. Several papers propose novel techniques to achieve more efficien…
-
SANA-WM model generates minute-long 720p videos
Researchers have released SANA-WM, an open-source world model that can generate minute-long videos at 720p resolution. This diffusion transformer model uses a hybrid linear attention mechanism and a dual-branch …
-
New method simplifies language model interpretability
Researchers have introduced Exemplar Partitioning (EP), a new method for mechanistic interpretability in language models that offers a more streamlined approach than existing dictionary-learning techniques like sparse a…
-
New methods enhance LLM control without sacrificing performance or reasoning
Researchers have developed new methods for steering large language model (LLM) behaviors at inference time without sacrificing generation quality. One approach, Prompt-only SV (PrOSV), intervenes only on prompt tokens, …
-
New methods enhance sparse autoencoder interpretability and stability
Researchers have developed new methods to address limitations in sparse autoencoders (SAEs), which are used to interpret the internal representations of large language models. One paper introduces adaptive elastic net S…