GPT-4.1

openai · Ranked across 5 benchmarks · best rank #7

Benchmark scores

| Benchmark | Category | Rank | Score | Captured |
| --- | --- | --- | --- | --- |
| METR Task Horizon (HCAST) | agents | #7 | 10m | 2025-07-12 |
| Aider Polyglot | code | #17 | 52.4% | 2025-04-14 |
| OpenRouter · Weekly Usage | usage | #19 | #41 | 2026-05-02 |
| SWE-bench Verified | agents | #26 | 39.6% | 2025-07-26 |
| Chatbot Arena | chat | #30 | 1382 | 2026-04-30 |