ENTITY LlamaGuard

PulseAugur coverage of LlamaGuard — every cluster mentioning LlamaGuard across labs, papers, and developer communities, ranked by signal.

Total · 30d: 4 (4 over 90d)

Releases · 30d: 0 (0 over 90d)

Papers · 30d: 1 (1 over 90d)

SENTIMENT · 30D

1 day with sentiment data

RECENT · PAGE 1/1 · 4 TOTAL

RESEARCH · CL_88573 · Jun 13 · 04:05

Google's AMS tool finds critical safety flaws in three tested LLMs

Google Cloud has open-sourced AMS (Activation Model Scanner), a tool that analyzes the geometric structure of a model's activation space to verify safety training. Unlike traditional behavioral tests, AMS directly inspe…

TOOL · CL_57758 · May 28 · 19:33

LLM Agents Vulnerable to Tool-Output Injection Attacks

LLM agents have a significant security vulnerability: malicious code can be injected through the outputs of the tools they use. This 'tool-output injection' bypasses standard input and output guardrails because …

RESEARCH · CL_16158 · May 5 · 04:00

AI safety models vulnerable to fine-tuning and embedding bypass attacks

Two new research papers explore vulnerabilities in AI safety mechanisms. The first paper, "When Safety Geometry Collapses," demonstrates how even benign fine-tuning of guard models can inadvertently destroy their safety al…

TOOL · CL_09472 · Apr 29 · 20:03

New proxy tool blocks prompt injection attacks on AI models

Arc Gate is a new proxy tool that sits in front of any OpenAI-compatible endpoint. It is designed to block prompt injection attacks before they can reach the underly…