Introducing EVMbench

OpenAI Blog

OpenAI and Paradigm release EVMbench, a benchmark for evaluating AI agents on smart contract security tasks, including the detection, patching, and exploitation of vulnerabilities.

Categories: Research

Excerpt

OpenAI and Paradigm introduce EVMbench, a benchmark evaluating AI agents’ ability to detect, patch, and exploit high-severity smart contract vulnerabilities.