OpenAI launches EVMbench to test AI agents' smart contract security skills

By PulseAugur Editorial · Summary by gemini-2.5-flash-lite from 1 source

OpenAI has introduced EVMbench, a benchmark that evaluates how well AI agents can detect, patch, and exploit vulnerabilities in smart contracts. The benchmark draws on a curated set of 117 vulnerabilities taken from audits and aims to improve the security of blockchain environments, which handle over $100 billion in assets. Early results show GPT-5.3-Codex scoring 71.0% in exploit mode, a significant improvement over previous models, though detection and patching capabilities still require further development.

COVERAGE [1]

OpenAI News TIER_1 · 2026-02-18 00:00

Introducing EVMbench

OpenAI and Paradigm introduce EVMbench, a benchmark evaluating AI agents’ ability to detect, patch, and exploit high-severity smart contract vulnerabilities.
