Aider Polyglot

Category: code · Unit: %

Measures code editing across multiple languages (Python, JavaScript, Go, Rust, and others) using Aider's benchmark harness.

Top 25 models

Rank Model Score Captured
1 GPT-5 88.0% 2025-08-23
2 o3 84.9% 2025-06-28
3 Gemini 2.5 Pro 83.1% 2025-06-06
4 Grok 4 79.6% 2025-07-11
5 DeepSeek V3 74.2% 2025-10-03
6 o4-mini 72.0% 2025-04-16
7 Claude Opus 4 72.0% 2025-05-25
8 DeepSeek R1 71.4% 2025-06-06
9 Claude 3.7 Sonnet 64.9% 2025-02-24
10 o1 61.7% 2024-12-21
11 Claude Sonnet 4 61.3% 2025-05-24
12 o3-mini 60.4% 2025-01-31
13 Qwen 3 235B 59.6% 2025-05-09
14 Kimi K2 59.1% 2025-07-17
15 Gemini 2.5 Flash 55.1% 2025-05-25
16 Grok 3 53.3% 2025-04-10
17 GPT-4.1 52.4% 2025-04-14
18 Claude 3.5 Sonnet 51.6% 2025-01-17
19 GPT-4o 45.3% 2025-03-29
20 GPT-4.5 44.9% 2025-02-27
21 Gemini 2.0 Pro 35.6% 2025-02-25
22 o1-mini 32.9% 2024-12-22
23 Claude 3.5 Haiku 28.0% 2024-12-21
24 Gemini 2.0 Flash 22.2% 2024-12-22
25 Llama 4 Maverick 15.6% 2025-04-06