Gemma-2-9B-it
PulseAugur coverage of Gemma-2-9B-it — every cluster mentioning Gemma-2-9B-it across labs, papers, and developer communities, ranked by signal.
-
New research audits LLM alignment shifts using effective rank
A new research paper introduces an "effective-rank" audit to analyze how alignment techniques alter the internal workings of large language models. The study examines three open-weight models: Llama-3.1-8B-Instruct, Gem…
-
New methods enhance LLM control without sacrificing performance or reasoning
Researchers have developed new methods for steering large language model (LLM) behaviors at inference time without sacrificing generation quality. One approach, Prompt-only SV (PrOSV), intervenes only on prompt tokens, …
-
New attack redirects LLM attention to bypass safety alignment
Researchers have developed a new white-box adversarial attack called the Attention Redistribution Attack (ARA) that targets the internal attention mechanisms of safety-aligned large language models. This attack crafts n…
-
New DPO methods enhance LLM alignment with adaptive techniques
Researchers have developed several improvements to Direct Preference Optimization (DPO), a method for aligning large language models (LLMs) with human preferences. AdaDPO introduces self-adaptive coefficients to balance…