PulseAugur
research · [1 source] ·

OpenAI launches EVMbench to test AI agents' smart contract security skills

OpenAI and Paradigm have introduced EVMbench, a benchmark that evaluates how well AI agents detect, patch, and exploit high-severity vulnerabilities in smart contracts. The benchmark draws on a curated set of 117 vulnerabilities taken from audits and aims to improve the security of blockchain environments, which handle over $100 billion in assets. Early results show GPT-5.3-Codex scoring 71.0% in exploit mode, a significant improvement over previous models, though detection and patching capabilities still require further development.

Summary written by gemini-2.5-flash-lite from 1 source.

RANK_REASON OpenAI released a new benchmark for evaluating AI agents on smart contract security, which is a research-oriented release.

COVERAGE [1]

  1. OpenAI News TIER_1 ·

    Introducing EVMbench

    OpenAI and Paradigm introduce EVMbench, a benchmark evaluating AI agents’ ability to detect, patch, and exploit high-severity smart contract vulnerabilities.