Claude 3.5 Sonnet

Anthropic · Ranked across 5 benchmarks · best rank #5

Benchmark scores

Benchmark                   Category  Rank  Score  Captured
METR Task Horizon (HCAST)   agents    #5    28m    2025-07-12
Aider Polyglot              code      #18   51.6%  2025-01-17
SWE-bench Verified          agents    #18   62.8%  2025-02-28
Chatbot Arena               chat      #43   1297   2026-04-30
OpenRouter · Weekly Usage   usage     #52   #571   2026-05-02