PulseAugur
LIVE 10:41:54
ENTITY Auditing Sabotage Bench

Auditing Sabotage Bench

PulseAugur coverage of Auditing Sabotage Bench — every cluster mentioning Auditing Sabotage Bench across labs, papers, and developer communities, ranked by signal.

Total · 30d
1
1 over 90d
Releases · 30d
0
0 over 90d
Papers · 30d
1
1 over 90d
TIER MIX · 90D
RECENT · PAGE 1/1 · 1 TOTAL
  1. RESEARCH · CL_07032 ·

    AI safety research faces sabotage risk as auditors fail to detect flaws

    Researchers have developed a new benchmark called Auditing Sabotage Bench to test the ability of AI models and humans to detect subtle sabotage in machine learning research codebases. The benchmark includes nine ML code…