o4-mini

OpenAI · Ranked across 5 benchmarks · Best rank #2

Benchmark scores

Benchmark                    Category  Rank  Score   Captured
METR Task Horizon (HCAST)    agents    #2    1h18m   2025-07-12
Aider Polyglot               code      #6    72.0%   2025-04-16
SWE-bench Verified           agents    #22   45.0%   2025-07-26
Chatbot Arena                chat      #36   1353    2026-04-30
OpenRouter · Weekly Usage    usage     #42   #194    2026-05-02