o3

OpenAI · Ranked across 4 benchmarks · best rank #1

Benchmark scores

Benchmark                   Category  Rank  Score  Captured
METR Task Horizon (HCAST)   agents    #1    1h58m  2025-07-12
Aider Polyglot              code      #2    84.9%  2025-06-28
Chatbot Arena               chat      #27   1409   2026-04-30
OpenRouter · Weekly Usage   usage     #44   #214   2026-05-02