PulseAugur
PhoneSafety

PulseAugur coverage of PhoneSafety — every cluster mentioning PhoneSafety across labs, papers, and developer communities, ranked by signal.

Total · 30d: 1 (1 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 1 (1 over 90d)
TIER MIX · 90D
SENTIMENT · 30D: 1 day with sentiment data

RECENT · PAGE 1/1 · 1 TOTAL
  1. TOOL · CL_25587

    New benchmark separates AI safety from capability failures

    PhoneSafety is a new benchmark for evaluating the safety of AI agents designed for phone use. Existing evaluations often fail to distinguish between an agent's deliberate safe action and its i…