ShieldGemma
PulseAugur coverage of ShieldGemma — every cluster mentioning ShieldGemma across labs, papers, and developer communities, ranked by signal.
-
Encoder classifiers offer cost-effective LLM safety evaluation, study finds
A new research paper examines encoder classifiers from the ModernBERT family as a cost-efficient alternative to LLM-based judges for evaluating the safety of large language model outp…
-
New system detects distributional shift in AI safety classifiers
Researchers have developed an online system that monitors distributional shift in deployed AI safety classifiers. The system uses sequential statistics to detect when a classifier's performance degrades due to…
-
AI safety judges trained with curriculum for improved rubric consistency
Researchers have developed a new training strategy for AI safety judges, aiming to improve their consistency and reliability. The strategy involves using dynamic rubrics generated from prompt-response-label triples to e…
-
GLiNER Guard unifies LLM safety and PII detection in single pass
A new system called GLiNER Guard (GLiGuard) has been developed to streamline safety moderation and PII detection for large language models. This unified encoder collapses multiple classifiers and NER models into a singl…
-
Fastino Labs open-sources GLiGuard safety model
Fastino Labs has released GLiGuard, an open-source safety moderation model designed to be significantly faster and more efficient than existing solutions. Unlike traditional decoder-only models that generate responses t…
-
Google releases Gemma 2 2B, ShieldGemma, and Gemma Scope
Google has announced updates to its Gemma family of models, including the release of Gemma 2 2B. This new iteration is designed for efficiency and accessibility, aiming to empower developers with powerful yet lightweigh…