WildGuard
PulseAugur coverage of WildGuard — every cluster mentioning WildGuard across labs, papers, and developer communities, ranked by signal.
2 days with sentiment data
-
GLiNER Guard unifies LLM safety and PII detection in single pass
A new system called GLiNER Guard (GLiGuard) has been developed to streamline safety moderation and PII detection for large language models. This unified encoder collapses multiple classifiers and NER models into a singl…
-
Fastino Labs open-sources GLiGuard safety model
Fastino Labs has released GLiGuard, an open-source safety moderation model designed to be significantly faster and more efficient than existing solutions. Unlike traditional decoder-only models that generate responses t…
-
AI safety models vulnerable to fine-tuning and embedding bypass attacks
Two new research papers explore vulnerabilities in AI safety mechanisms. The first paper, "When Safety Geometry Collapses," demonstrates how even benign fine-tuning of guard models can inadvertently destroy their safety al…