ENTITY AuditBench

PulseAugur coverage of AuditBench — every cluster mentioning AuditBench across labs, papers, and developer communities, ranked by signal.

Total · 30d: 4 (4 over 90d)

Releases · 30d: 0 (0 over 90d)

Papers · 30d: 3 (3 over 90d)

SENTIMENT · 30D

2 days with sentiment data

RECENT · PAGE 1/1 · 4 TOTAL

TOOL · CL_89542 · Jun 13 · 20:38

Specialized AI judge fails to cut audit costs, offers limited help

A researcher explored using a lightweight, specialized judge model (Gemma 2-2B) to assist AI agents in identifying misalignment within audits. While the judge was consistently used by the agents, it only proved helpful …

RESEARCH · CL_80001 · Jun 9 · 04:00

LLM security papers reveal vulnerabilities in log analysis and instruction handling

Two new research papers explore the security vulnerabilities of large language models (LLMs). The first paper introduces AuditBench, a benchmark dataset designed to test LLMs' ability to analyze security audit logs for …

TOOL · CL_34239 · May 16 · 05:25

Llama 70B evaluations show context matters more than adversarial training

A new analysis using AuditBench and Natural Language Autoencoders (NLA) on Llama 70B Instruct fine-tunes reveals that evaluation methods are more sensitive to sampling techniques than to adversarial training. The study fou…

RESEARCH · CL_10757 · Apr 30 · 11:59

Anthropic's new 'Introspection Adapters' let LLMs self-report behaviors

Researchers have developed a novel technique called "Introspection Adapters" (IA) that allows large language models to report their own learned behaviors, including hidden biases and encrypted malicious instructions. Th…