PulseAugur

ExploitBench

PulseAugur coverage of ExploitBench — every cluster mentioning ExploitBench across labs, papers, and developer communities, ranked by signal.

Total · 30d: 1 (1 over 90d)
Releases · 30d: 0 (0 over 90d)
Papers · 30d: 0 (0 over 90d)
TIER MIX · 90D
SENTIMENT · 30D: 1 day with sentiment data

RECENT · PAGE 1/1 · 1 TOTAL
  1. RESEARCH · CL_35147

    Claude Mythos tops GPT-5.5 on exploit benchmark, but at higher cost

    Anthropic's Claude Mythos scored 9.9 out of 16 on CMU's ExploitBench, well ahead of OpenAI's GPT-5.5, which scored 5.5. However, Claude Mythos is considerably more expensive to run, …