ENTITY WildGuard

PulseAugur coverage of WildGuard — every cluster mentioning WildGuard across labs, papers, and developer communities, ranked by signal.

Total: 5 over 90d
Releases: 0 over 90d
Papers: 3 over 90d

Sentiment (30d): 1 day with sentiment data

RECENT · PAGE 1/1 · 5 TOTAL

RESEARCH · CL_88573 · Jun 13 · 04:05

Google's AMS tool finds critical safety flaws in three tested LLMs

Google Cloud has open-sourced AMS (Activation Model Scanner), a tool that analyzes the geometric structure of a model's activation space to verify safety training. Unlike traditional behavioral tests, AMS directly inspe…

TOOL · CL_58738 · May 29 · 04:00

New Opir models offer efficient multi-task safety classification for LLMs

Researchers have introduced Opir, a new family of encoder-based guardrail models designed for efficient multi-task safety classification in large language model applications. Opir models are built on the GLiClass archit…

TOOL · CL_38995 · May 19 · 12:01

GLiNER Guard unifies LLM safety and PII detection in single pass

A new system called GLiNER Guard (GLiGuard) has been developed to streamline safety moderation and PII detection for large language models. This unified encoder collapses multiple classifiers and NER models into a singl…

TOOL · CL_30372 · May 13 · 20:41

Fastino Labs open-sources GLiGuard safety model

Fastino Labs has released GLiGuard, an open-source safety moderation model designed to be significantly faster and more efficient than existing solutions. Unlike traditional decoder-only models that generate responses t…

RESEARCH · CL_16158 · May 5 · 04:00

AI safety models vulnerable to fine-tuning and embedding bypass attacks

Two new research papers explore vulnerabilities in AI safety mechanisms. The first paper, "When Safety Geometry Collapses," demonstrates how fine-tuning even benign guard models can inadvertently destroy their safety al…
