GPT-4.1
Benchmark scores
| Benchmark | Category | Rank | Score | Captured |
|---|---|---|---|---|
| METR Task Horizon (HCAST) | agents | #7 | 10 min | 2025-07-12 |
| Aider Polyglot | code | #17 | 52.4% | 2025-04-14 |
| OpenRouter · Weekly Usage | usage | #19 | #41 | 2026-05-02 |
| SWE-bench Verified | agents | #26 | 39.6% | 2025-07-26 |
| Chatbot Arena | chat | #30 | 1382 | 2026-04-30 |