2026 Global AI LLM Value Index Ranking
Data updated (UTC):
| Model | Vendor | Region | Intelligence index | Value | Input $/1M tokens | Output $/1M tokens | TTFT (s) | Token rate (tok/s) |
|---|---|---|---|---|---|---|---|---|
| Qwen3.5 9B (Reasoning) | Alibaba | — | 32.4 | 100.00 | 0.100 | 0.150 | 0.30 | 58.5 |
| MiMo-V2-Flash (Feb 2026) | Xiaomi | — | 41.5 | 96.06 | 0.100 | 0.300 | 1.48 | 140.8 |
| MiMo-V2-Flash (Reasoning) | Xiaomi | — | 39.2 | 90.74 | 0.100 | 0.300 | 1.40 | 139.9 |
| gpt-oss-20B (high) | OpenAI | — | 24.5 | 89.54 | 0.060 | 0.200 | 0.45 | 313.1 |
| Gemma 3n E4B Instruct | Google | — | 6.4 | 88.89 | 0.020 | 0.040 | 0.33 | 46.6 |
| Step 3.5 Flash | StepFun | — | 37.8 | 87.50 | 0.100 | 0.300 | 1.33 | 83.4 |
| Devstral Small (May '25) | Mistral | — | 18.0 | 83.33 | 0.060 | 0.120 | 0.00 | 0.0 |
| NVIDIA Nemotron 3 Nano 30B A3B (Reasoning) | NVIDIA | — | 24.3 | 80.35 | 0.060 | 0.240 | 1.19 | 139.0 |
| gpt-oss-20B (low) | OpenAI | — | 20.8 | 76.02 | 0.060 | 0.200 | 0.45 | 319.7 |
| NVIDIA Nemotron Nano 9B V2 (Reasoning) | NVIDIA | — | 14.8 | 73.40 | 0.040 | 0.160 | 0.22 | 123.0 |
| MiMo-V2-Flash (Non-reasoning) | Xiaomi | — | 30.4 | 70.36 | 0.100 | 0.300 | 1.38 | 137.4 |
| LFM2 24B A2B | Liquid AI | — | 10.5 | 69.43 | 0.030 | 0.120 | 0.23 | 191.7 |
| GLM-4.7-Flash (Reasoning) | Z AI | — | 30.1 | 68.52 | 0.070 | 0.400 | 0.66 | 88.7 |
| GPT-5 nano (high) | OpenAI | — | 26.8 | 67.67 | 0.050 | 0.400 | 63.33 | 159.1 |
| GPT-5 nano (medium) | OpenAI | — | 25.9 | 65.39 | 0.050 | 0.400 | 33.88 | 151.3 |
| Nova Micro | Amazon | — | 10.3 | 58.38 | 0.035 | 0.140 | 0.34 | 345.9 |
| NVIDIA Nemotron Nano 9B V2 (Non-reasoning) | NVIDIA | — | 13.2 | 53.12 | 0.050 | 0.195 | 0.56 | 134.6 |
| NVIDIA Nemotron 3 Nano 30B A3B (Non-reasoning) | NVIDIA | — | 13.2 | 52.36 | 0.050 | 0.200 | 0.26 | 80.8 |
| GLM-4.7-Flash (Non-reasoning) | Z AI | — | 22.1 | 50.30 | 0.070 | 0.400 | 0.71 | 90.9 |
| Grok 4.1 Fast (Reasoning) | xAI | — | 38.6 | 48.72 | 0.200 | 0.500 | 5.24 | 120.4 |
| Qwen2.5 Turbo | Alibaba | — | 12.0 | 47.60 | 0.050 | 0.200 | 1.00 | 64.4 |
| DeepSeek V3.2 (Reasoning) | DeepSeek | — | 41.7 | 45.95 | 0.280 | 0.420 | 1.24 | 35.8 |
| Grok 4 Fast (Reasoning) | xAI | — | 35.1 | 44.30 | 0.200 | 0.500 | 4.13 | 143.3 |
| gpt-oss-120B (high) | OpenAI | — | 33.3 | 44.03 | 0.150 | 0.600 | 0.46 | 292.5 |
| Gemini 2.5 Flash-Lite Preview (Sep '25) (Reasoning) | Google | — | 21.6 | 42.84 | 0.100 | 0.400 | 4.79 | 359.4 |
| Nova Lite | Amazon | — | 12.7 | 41.98 | 0.060 | 0.240 | 0.38 | 219.4 |
| Llama 3.1 Instruct 8B | Meta | — | 11.8 | 40.95 | 0.100 | 0.100 | 0.44 | 204.5 |
| Llama 3.2 Instruct 3B | Meta | — | 9.7 | 39.60 | 0.085 | 0.085 | 0.43 | 53.7 |
| QwQ 32B-Preview | Alibaba | — | 15.2 | 39.07 | 0.120 | 0.180 | 0.47 | 60.1 |
| Ministral 3 3B | Mistral | — | 11.2 | 38.87 | 0.100 | 0.100 | 0.25 | 289.4 |
| Gemini 2.5 Flash-Lite Preview (Sep '25) (Non-reasoning) | Google | — | 19.4 | 38.47 | 0.100 | 0.400 | 5.74 | 304.1 |
| Llama Nemotron Super 49B v1.5 (Reasoning) | NVIDIA | — | 18.7 | 37.08 | 0.100 | 0.400 | 0.22 | 85.6 |
| DeepSeek V3.2 Exp (Reasoning) | DeepSeek | — | 32.9 | 36.24 | 0.280 | 0.420 | 1.39 | 36.0 |
| Mistral Small 4 (Reasoning) | Mistral | — | 26.9 | 35.56 | 0.150 | 0.600 | 0.00 | 0.0 |
| DeepSeek V3.2 (Non-reasoning) | DeepSeek | — | 32.1 | 35.36 | 0.280 | 0.420 | 1.27 | 36.1 |
| Devstral Small (Jul '25) | Mistral | — | 15.2 | 35.16 | 0.100 | 0.300 | 0.34 | 213.4 |
| Mistral Small 3.2 | Mistral | — | 15.1 | 34.93 | 0.100 | 0.300 | 0.29 | 197.3 |
| Gemini 2.5 Flash-Lite (Reasoning) | Google | — | 17.6 | 34.90 | 0.100 | 0.400 | 20.28 | 318.5 |
| Granite 4.0 H Small | IBM | — | 10.8 | 34.86 | 0.060 | 0.250 | 8.69 | 526.1 |
| GPT-5 nano (minimal) | OpenAI | — | 13.8 | 34.83 | 0.050 | 0.400 | 0.70 | 159.1 |
| Ministral 3 8B | Mistral | — | 14.8 | 34.24 | 0.150 | 0.150 | 0.29 | 174.2 |
| Llama 2 Chat 7B | Meta | — | 9.7 | 33.66 | 0.050 | 0.250 | 0.52 | 124.2 |
| Mistral Small 3.1 | Mistral | — | 14.5 | 33.54 | 0.100 | 0.300 | 0.42 | 154.3 |
| GPT-5.4 nano (xhigh) | OpenAI | — | 44.4 | 33.31 | 0.200 | 1.250 | 2.28 | 215.7 |
| MiniMax-M2.7 | MiniMax | — | 49.6 | 32.78 | 0.300 | 1.200 | 1.55 | 43.5 |
| gpt-oss-120B (low) | OpenAI | — | 24.5 | 32.38 | 0.150 | 0.600 | 0.48 | 300.4 |
| Grok 3 mini Reasoning (high) | xAI | — | 32.1 | 31.82 | 0.300 | 0.500 | 0.37 | 198.3 |
| Llama 3 Instruct 8B | Meta | — | 6.4 | 31.72 | 0.045 | 0.145 | 0.36 | 90.2 |
| DeepSeek V3.2 Exp (Non-reasoning) | DeepSeek | — | 28.4 | 31.28 | 0.280 | 0.420 | 1.27 | 35.9 |
| Mercury 2 | Inception | — | 32.8 | 30.35 | 0.250 | 0.750 | 4.04 | 834.7 |
| NVIDIA Nemotron 3 Super 120B A12B (Reasoning) | NVIDIA | — | 36.0 | 30.28 | 0.300 | 0.750 | 0.59 | 411.2 |
| Grok 4.1 Fast (Non-reasoning) | xAI | — | 23.6 | 29.77 | 0.200 | 0.500 | 0.31 | 121.1 |
| Mistral Small 3 | Mistral | — | 12.7 | 29.37 | 0.100 | 0.300 | 0.40 | 149.9 |
| Grok 4 Fast (Non-reasoning) | xAI | — | 23.1 | 29.14 | 0.200 | 0.500 | 0.28 | 104.8 |
| Seed-OSS-36B-Instruct | ByteDance Seed | — | 25.2 | 29.14 | 0.210 | 0.570 | 1.67 | 45.5 |
| Llama Nemotron Super 49B v1.5 (Non-reasoning) | NVIDIA | — | 14.6 | 28.94 | 0.100 | 0.400 | 0.22 | 88.1 |
| GPT-5.4 nano (medium) | OpenAI | — | 38.1 | 28.58 | 0.200 | 1.250 | 1.89 | 217.1 |
| Granite 3.3 8B (Non-reasoning) | IBM | — | 7.0 | 28.57 | 0.030 | 0.250 | 7.12 | 549.7 |
| Hermes 4 - Llama-3.1 70B (Reasoning) | Nous Research | — | 16.0 | 28.10 | 0.130 | 0.400 | 0.56 | 84.1 |
| Ministral 3 14B | Mistral | — | 16.0 | 27.75 | 0.200 | 0.200 | 0.28 | 132.3 |
| MiniMax-M2.5 | MiniMax | — | 41.9 | 27.69 | 0.300 | 1.200 | 3.46 | 47.2 |
| Solar Mini | Upstage | — | 11.9 | 27.52 | 0.150 | 0.150 | 1.46 | 95.7 |
| MiniMax-M2.1 | MiniMax | — | 39.4 | 26.03 | 0.300 | 1.200 | 2.02 | 42.2 |
| GPT-4.1 nano | OpenAI | — | 13.0 | 25.77 | 0.100 | 0.400 | 0.38 | 150.1 |
| Gemini 2.5 Flash-Lite (Non-reasoning) | Google | — | 12.7 | 25.17 | 0.100 | 0.400 | 0.29 | 293.4 |
| Mistral Small 4 (Non-reasoning) | Mistral | — | 18.6 | 24.58 | 0.150 | 0.600 | 0.40 | 155.1 |
| Gemini 2.0 Flash (Feb '25) | Google | — | 18.5 | 24.45 | 0.150 | 0.600 | 0.00 | 0.0 |
| MiniMax-M2 | MiniMax | — | 36.1 | 23.85 | 0.300 | 1.200 | 2.63 | 42.9 |
| KAT-Coder-Pro V1 | KwaiKAT | — | 36.0 | 23.78 | 0.300 | 1.200 | 1.59 | 60.4 |
| Qwen3 4B (Non-reasoning) | Alibaba | — | 12.5 | 23.12 | 0.110 | 0.420 | 0.95 | 105.1 |
| Olmo 3 7B Instruct | Allen Institute for AI | — | 8.2 | 22.75 | 0.100 | 0.200 | 0.43 | 87.3 |
| Hermes 4 - Llama-3.1 70B (Non-reasoning) | Nous Research | — | 12.6 | 22.13 | 0.130 | 0.400 | 0.58 | 81.7 |
| DeepSeek R1 Distill Qwen 32B | DeepSeek | — | 17.2 | 22.09 | 0.270 | 0.270 | 0.49 | 59.4 |
| Ling-flash-2.0 | InclusionAI | — | 15.7 | 22.00 | 0.140 | 0.570 | 1.46 | 57.7 |
| Llama 3.2 Instruct 1B | Meta | — | 6.3 | 21.85 | 0.100 | 0.100 | 0.65 | 232.1 |
| GPT-5 mini (high) | OpenAI | — | 41.2 | 20.78 | 0.250 | 2.000 | 71.09 | 89.9 |
| Gemini 3.1 Flash-Lite Preview | Google | — | 33.5 | 20.65 | 0.250 | 1.500 | 5.29 | 229.7 |
| GPT-5 mini (medium) | OpenAI | — | 38.9 | 19.62 | 0.250 | 2.000 | 14.73 | 81.5 |
| Ring-flash-2.0 | InclusionAI | — | 14.0 | 19.61 | 0.140 | 0.570 | 1.92 | 92.7 |
| GPT-5.1 Codex mini (high) | OpenAI | — | 38.6 | 19.47 | 0.250 | 2.000 | 2.05 | 205.5 |
| Grok Code Fast 1 | xAI | — | 28.7 | 18.95 | 0.200 | 1.500 | 3.10 | 165.9 |
| GLM-4.5-Air | Z AI | — | 23.2 | 18.93 | 0.200 | 1.100 | 0.64 | 101.0 |
| Llama 3.2 Instruct 11B (Vision) | Meta | — | 8.7 | 18.85 | 0.160 | 0.160 | 0.41 | 85.8 |
| Qwen3.5 35B A3B (Reasoning) | Alibaba | — | 37.1 | 18.71 | 0.250 | 2.000 | 1.02 | 122.0 |
| GPT-5.4 nano (Non-Reasoning) | OpenAI | — | 24.4 | 18.29 | 0.200 | 1.250 | 0.41 | 217.5 |
| GLM-4.6V (Reasoning) | Z AI | — | 23.4 | 18.03 | 0.300 | 0.900 | 0.86 | 28.9 |
| Qwen3.5 27B (Reasoning) | Alibaba | — | 42.1 | 17.69 | 0.300 | 2.400 | 1.30 | 89.8 |
| Qwen3 Coder Next | Alibaba | — | 28.3 | 17.44 | 0.350 | 1.200 | 0.77 | 165.4 |
| NVIDIA Nemotron Nano 12B v2 VL (Reasoning) | NVIDIA | — | 14.9 | 17.22 | 0.200 | 0.600 | 0.46 | 133.0 |
| GPT-4o mini | OpenAI | — | 12.6 | 16.64 | 0.150 | 0.600 | 0.49 | 57.3 |
| Phi-4 | Microsoft Azure | — | 10.4 | 16.48 | 0.125 | 0.500 | 0.43 | 30.7 |
| Apertus 8B Instruct | Swiss AI Initiative | — | 5.9 | 16.36 | 0.100 | 0.200 | 3.69 | 7.1 |
| Llama 4 Scout | Meta | — | 13.5 | 16.00 | 0.170 | 0.660 | 0.45 | 131.2 |
| Qwen3 VL 8B Instruct | Alibaba | — | 14.3 | 15.99 | 0.180 | 0.700 | 1.01 | 140.4 |
| Qwen3 VL 30B A3B Instruct | Alibaba | — | 16.1 | 15.94 | 0.200 | 0.800 | 1.11 | 123.2 |
| DeepSeek V3.1 Terminus (Non-reasoning) | DeepSeek | — | 28.5 | 15.77 | 0.335 | 1.500 | 0.00 | 0.0 |
| Qwen3.5 27B (Non-reasoning) | Alibaba | — | 37.2 | 15.63 | 0.300 | 2.400 | 1.30 | 89.7 |
| Qwen3.5 35B A3B (Non-reasoning) | Alibaba | — | 30.7 | 15.48 | 0.250 | 2.000 | 1.04 | 131.0 |
| Llama 4 Maverick | Meta | — | 18.4 | 15.37 | 0.270 | 0.850 | 0.47 | 132.9 |
| Qwen3 30B A3B 2507 Instruct | Alibaba | — | 15.0 | 14.85 | 0.200 | 0.800 | 1.07 | 69.5 |
| DeepSeek V3.1 Terminus (Reasoning) | DeepSeek | — | 33.9 | 14.68 | 0.400 | 2.000 | 0.00 | 0.0 |
| GLM-4.7 (Reasoning) | Z AI | — | 42.1 | 14.59 | 0.600 | 2.200 | 0.69 | 91.4 |
| Gemini 3 Flash Preview (Reasoning) | Google | — | 46.4 | 14.29 | 0.500 | 3.000 | 4.87 | 185.1 |
| Olmo 3.1 32B Instruct | Allen Institute for AI | — | 12.2 | 14.09 | 0.200 | 0.600 | 0.48 | 53.7 |
| Kimi K2.5 (Reasoning) | Kimi | — | 46.8 | 13.51 | 0.600 | 3.000 | 1.35 | 33.9 |
| Kimi K2 Thinking | Kimi | — | 40.9 | 13.18 | 0.600 | 2.500 | 0.66 | 98.5 |
| GLM-4.6V (Non-reasoning) | Z AI | — | 17.1 | 13.16 | 0.300 | 0.900 | 5.00 | 22.5 |
| Qwen3.5 122B A10B (Reasoning) | Alibaba | — | 41.6 | 13.10 | 0.400 | 3.200 | 0.98 | 157.5 |
| Qwen3 Omni 30B A3B (Reasoning) | Alibaba | — | 15.6 | 12.57 | 0.250 | 0.970 | 0.92 | 105.0 |
| Qwen3 1.7B (Non-reasoning) | Alibaba | — | 6.8 | 12.56 | 0.110 | 0.420 | 0.87 | 139.7 |
| GLM-4.7 (Non-reasoning) | Z AI | — | 34.2 | 12.47 | 0.550 | 2.150 | 0.65 | 90.2 |
| Qwen3 4B (Reasoning) | Alibaba | — | 14.2 | 12.37 | 0.110 | 1.260 | 0.93 | 104.2 |
| Qwen3 30B A3B (Non-reasoning) | Alibaba | — | 12.5 | 12.37 | 0.200 | 0.800 | 1.06 | 63.4 |
| Hermes 3 - Llama-3.1 70B | Nous Research | — | 10.6 | 12.24 | 0.300 | 0.300 | 0.28 | 41.1 |
| Nova 2.0 Lite (medium) | Amazon | — | 29.7 | 12.10 | 0.300 | 2.500 | 9.62 | 206.3 |
| Reka Flash (Sep '24) | Reka AI | — | 12.0 | 11.87 | 0.200 | 0.800 | 1.23 | 84.9 |
| Qwen3 8B (Non-reasoning) | Alibaba | — | 10.6 | 11.84 | 0.180 | 0.700 | 0.92 | 73.5 |
| Mistral Small (Sep '24) | Mistral | — | 10.2 | 11.78 | 0.200 | 0.600 | 0.40 | 150.3 |
| NVIDIA Nemotron Nano 12B v2 VL (Non-reasoning) | NVIDIA | — | 10.1 | 11.66 | 0.200 | 0.600 | 0.55 | 138.0 |
| DeepSeek V3.1 (Non-reasoning) | DeepSeek | — | 28.1 | 11.59 | 0.560 | 1.680 | 0.00 | 0.0 |
| Qwen3.5 397B A17B (Reasoning) | Alibaba | — | 45.0 | 11.54 | 0.600 | 3.600 | 1.32 | 51.7 |
| GLM-4.6 (Reasoning) | Z AI | — | 32.5 | 11.47 | 0.575 | 2.200 | 0.77 | 94.0 |
| Nova 2.0 Omni (medium) | Amazon | — | 28.0 | 11.41 | 0.300 | 2.500 | 0.00 | 0.0 |
| GPT-4.1 mini | OpenAI | — | 22.9 | 11.33 | 0.400 | 1.600 | 0.46 | 76.8 |
| Qwen3.5 122B A10B (Non-reasoning) | Alibaba | — | 35.9 | 11.30 | 0.400 | 3.200 | 1.02 | 156.1 |
| GLM-5 (Reasoning) | Z AI | — | 49.8 | 11.13 | 1.000 | 3.200 | 0.73 | 88.6 |
| Jamba 1.5 Mini | AI21 Labs | — | 8.0 | 11.08 | 0.200 | 0.400 | 0.00 | 0.0 |
| Gemini 2.5 Flash (Reasoning) | Google | — | 27.0 | 11.00 | 0.300 | 2.500 | 10.90 | 232.1 |
| DeepSeek V3.1 (Reasoning) | DeepSeek | — | 27.7 | 10.96 | 0.600 | 1.700 | 0.00 | 0.0 |
| Jamba 1.6 Mini | AI21 Labs | — | 7.9 | 10.94 | 0.200 | 0.400 | 0.64 | 184.1 |
| DeepSeek V3 (Dec '24) | DeepSeek | — | 16.5 | 10.93 | 0.400 | 0.890 | 0.00 | 0.0 |
| GLM-4.5 (Reasoning) | Z AI | — | 26.4 | 10.85 | 0.490 | 1.900 | 0.85 | 35.5 |
| Gemini 3 Flash Preview (Non-reasoning) | Google | — | 35.0 | 10.77 | 0.500 | 3.000 | 0.65 | 191.5 |
| Kimi K2.5 (Non-reasoning) | Kimi | — | 37.3 | 10.76 | 0.600 | 3.000 | 1.37 | 34.7 |
| ERNIE 4.5 300B A47B | Baidu | — | 15.0 | 10.71 | 0.280 | 1.100 | 1.95 | 32.2 |
| Mistral Large 3 | Mistral | — | 22.8 | 10.53 | 0.500 | 1.500 | 0.50 | 58.7 |
| Qwen3 0.6B (Non-reasoning) | Alibaba | — | 5.7 | 10.53 | 0.110 | 0.420 | 0.96 | 224.5 |
| GLM-4.6 (Non-reasoning) | Z AI | — | 30.2 | 10.46 | 0.600 | 2.200 | 1.28 | 103.1 |
| GPT-5 mini (minimal) | OpenAI | — | 20.7 | 10.42 | 0.250 | 2.000 | 0.86 | 92.5 |
| Qwen3 30B A3B 2507 (Reasoning) | Alibaba | — | 22.4 | 10.34 | 0.200 | 2.400 | 0.97 | 145.6 |
| Qwen3.5 397B A17B (Non-reasoning) | Alibaba | — | 40.1 | 10.28 | 0.600 | 3.600 | 1.51 | 49.7 |
| Mistral 7B Instruct | Mistral | — | 7.4 | 10.25 | 0.250 | 0.250 | 0.27 | 169.1 |
| Nova 2.0 Lite (low) | Amazon | — | 24.6 | 10.02 | 0.300 | 2.500 | 3.85 | 218.9 |
| GPT-5.4 mini (xhigh) | OpenAI | — | 48.1 | 9.87 | 0.750 | 4.500 | 2.64 | 230.3 |
| Nova 2.0 Omni (low) | Amazon | — | 23.2 | 9.45 | 0.300 | 2.500 | 0.00 | 0.0 |
| Reka Flash 3 | Reka AI | — | 9.5 | 9.39 | 0.200 | 0.800 | 1.28 | 55.5 |
| Mistral Medium 3.1 | Mistral | — | 21.3 | 9.21 | 0.400 | 2.000 | 0.37 | 87.3 |
| Kimi K2 0905 | Kimi | — | 30.9 | 9.20 | 0.800 | 2.250 | 0.81 | 54.0 |
| QwQ 32B | Alibaba | — | 19.7 | 9.15 | 0.660 | 1.000 | 0.42 | 33.0 |
| Qwen3 VL 30B A3B (Reasoning) | Alibaba | — | 19.7 | 9.09 | 0.200 | 2.400 | 1.00 | 128.6 |
| GLM-5 (Non-reasoning) | Z AI | — | 40.6 | 9.06 | 1.000 | 3.200 | 0.95 | 72.3 |
| Kimi K2 | Kimi | — | 26.3 | 8.86 | 0.570 | 2.400 | 0.97 | 52.3 |
| MiniMax M1 80k | MiniMax | — | 24.4 | 8.77 | 0.550 | 2.200 | 0.00 | 0.0 |
| Qwen3 VL 8B (Reasoning) | Alibaba | — | 16.7 | 8.75 | 0.180 | 2.100 | 1.01 | 137.5 |
| Qwen3 Omni 30B A3B Instruct | Alibaba | — | 10.7 | 8.61 | 0.250 | 0.970 | 0.88 | 107.9 |
| Claude 3 Haiku | Anthropic | — | 12.3 | 8.51 | 0.250 | 1.250 | 0.42 | 131.7 |
| Magistral Small 1.2 | Mistral | — | 18.2 | 8.39 | 0.500 | 1.500 | 0.34 | 131.0 |
| Gemini 2.5 Flash (Non-reasoning) | Google | — | 20.6 | 8.38 | 0.300 | 2.500 | 0.41 | 212.0 |
| Llama 3.3 Instruct 70B | Meta | — | 14.5 | 8.19 | 0.580 | 0.710 | 0.50 | 94.9 |
| Mistral Medium 3 | Mistral | — | 18.8 | 8.13 | 0.400 | 2.000 | 0.38 | 49.5 |
| Devstral Medium | Mistral | — | 18.7 | 8.08 | 0.400 | 2.000 | 0.39 | 138.8 |
| Qwen3 Next 80B A3B Instruct | Alibaba | — | 20.1 | 7.94 | 0.500 | 2.000 | 0.97 | 174.2 |
| GPT-5.4 mini (medium) | OpenAI | — | 37.7 | 7.73 | 0.750 | 4.500 | 2.55 | 231.6 |
| Llama 3.1 Instruct 70B | Meta | — | 12.5 | 7.72 | 0.560 | 0.560 | 0.43 | 41.9 |
| Qwen3 Coder 30B A3B Instruct | Alibaba | — | 20.0 | 7.68 | 0.450 | 2.250 | 1.43 | 26.4 |
| Nova 2.0 Lite (Non-reasoning) | Amazon | — | 18.0 | 7.32 | 0.300 | 2.500 | 0.53 | 175.2 |
| Qwen3 14B (Non-reasoning) | Alibaba | — | 12.8 | 7.22 | 0.350 | 1.400 | 0.99 | 65.8 |
| Qwen3 235B A22B 2507 Instruct | Alibaba | — | 25.0 | 7.05 | 0.700 | 2.800 | 1.06 | 64.7 |
| Qwen3 30B A3B (Reasoning) | Alibaba | — | 15.3 | 7.05 | 0.200 | 2.400 | 1.15 | 62.2 |
| DeepSeek R1 Distill Llama 70B | DeepSeek | — | 16.0 | 7.02 | 0.700 | 1.050 | 0.79 | 62.7 |
| Qwen3 1.7B (Reasoning) | Alibaba | — | 8.0 | 6.96 | 0.110 | 1.260 | 0.91 | 139.2 |
| Qwen3 8B (Reasoning) | Alibaba | — | 13.2 | 6.91 | 0.180 | 2.100 | 1.01 | 59.6 |
| Nova 2.0 Omni (Non-reasoning) | Amazon | — | 16.6 | 6.75 | 0.300 | 2.500 | 0.57 | 207.0 |
| Claude 4.5 Haiku (Reasoning) | Anthropic | — | 37.1 | 6.41 | 1.000 | 5.000 | 9.75 | 131.9 |
| o4-mini (high) | OpenAI | — | 33.1 | 5.94 | 1.100 | 4.400 | 17.25 | 148.3 |
| DeepSeek V3 0324 | DeepSeek | — | 22.3 | 5.92 | 1.250 | 1.450 | 0.00 | 0.0 |
| Qwen3 VL 235B A22B Instruct | Alibaba | — | 20.8 | 5.86 | 0.700 | 2.800 | 1.14 | 60.7 |
| GLM-4.5V (Reasoning) | Z AI | — | 15.1 | 5.79 | 0.600 | 1.800 | 1.00 | 57.5 |
| Llama 3.1 Nemotron Ultra 253B v1 (Reasoning) | NVIDIA | — | 15.0 | 5.75 | 0.600 | 1.800 | 0.68 | 42.2 |
| Qwen3 Max Thinking | Alibaba | — | 39.9 | 5.74 | 1.200 | 6.000 | 1.59 | 32.5 |
| Llama 3.2 Instruct 90B (Vision) | Meta | — | 11.9 | 5.71 | 0.720 | 0.720 | 0.39 | 42.1 |
| Qwen3 0.6B (Reasoning) | Alibaba | — | 6.5 | 5.65 | 0.110 | 1.260 | 0.86 | 224.3 |
| Grok 4.20 Beta 0309 (Reasoning) | xAI | — | 48.5 | 5.58 | 2.000 | 6.000 | 22.42 | 112.9 |
| Claude 4.5 Haiku (Non-reasoning) | Anthropic | — | 31.1 | 5.37 | 1.000 | 5.000 | 0.46 | 94.0 |
| Sonar | Perplexity | — | 15.5 | 5.35 | 1.000 | 1.000 | 1.14 | 126.1 |
| Qwen3 Next 80B A3B (Reasoning) | Alibaba | — | 26.7 | 4.91 | 0.500 | 6.000 | 1.02 | 160.7 |
| GLM-4.5V (Non-reasoning) | Z AI | — | 12.7 | 4.87 | 0.600 | 1.800 | 22.15 | 64.3 |
| Qwen3 VL 32B Instruct | Alibaba | — | 17.2 | 4.84 | 0.700 | 2.800 | 1.03 | 76.5 |
| Qwen3 235B A22B (Non-reasoning) | Alibaba | — | 17.0 | 4.79 | 0.700 | 2.800 | 1.10 | 63.2 |
| GPT-5.1 (high) | OpenAI | — | 47.7 | 4.79 | 1.250 | 10.000 | 14.08 | 123.6 |
| Mixtral 8x7B Instruct | Mistral | — | 7.7 | 4.78 | 0.540 | 0.600 | 0.00 | 0.0 |
| GPT-5.4 mini (Non-Reasoning) | OpenAI | — | 23.3 | 4.76 | 0.750 | 4.500 | 0.40 | 210.3 |
| Qwen3 Max Thinking (Preview) | Alibaba | — | 32.5 | 4.67 | 1.200 | 6.000 | 1.69 | 44.2 |
| o3-mini | OpenAI | — | 25.9 | 4.64 | 1.100 | 4.400 | 6.93 | 154.8 |
| o3-mini (high) | OpenAI | — | 25.2 | 4.51 | 1.100 | 4.400 | 21.11 | 151.8 |
| Qwen3 Max | Alibaba | — | 31.4 | 4.51 | 1.200 | 6.000 | 1.75 | 32.8 |
| GPT-5 Codex (high) | OpenAI | — | 44.6 | 4.47 | 1.250 | 10.000 | 5.25 | 171.8 |
| GPT-5 (high) | OpenAI | — | 44.6 | 4.47 | 1.250 | 10.000 | 77.71 | 67.1 |
| Gemini 3.1 Pro Preview | Google | — | 57.2 | 4.38 | 2.000 | 12.000 | 26.66 | 120.0 |
| GPT-5.1 Codex (high) | OpenAI | — | 43.1 | 4.32 | 1.250 | 10.000 | 3.94 | 127.0 |
| Hermes 4 - Llama-3.1 405B (Reasoning) | Nous Research | — | 18.6 | 4.27 | 1.000 | 3.000 | 0.79 | 32.2 |
| Qwen3 14B (Reasoning) | Alibaba | — | 16.2 | 4.25 | 0.350 | 4.200 | 1.00 | 65.3 |
| GPT-5 (medium) | OpenAI | — | 42.0 | 4.21 | 1.250 | 10.000 | 40.09 | 55.0 |
| GPT-3.5 Turbo | OpenAI | — | 9.0 | 4.13 | 0.500 | 1.500 | 0.47 | 106.6 |
| Qwen3 32B (Non-reasoning) | Alibaba | — | 14.5 | 4.08 | 0.700 | 2.800 | 0.98 | 101.1 |
| Hermes 4 - Llama-3.1 405B (Non-reasoning) | Nous Research | — | 17.6 | 4.04 | 1.000 | 3.000 | 0.72 | 31.4 |
| Claude 3.5 Haiku | Anthropic | — | 18.7 | 4.03 | 0.800 | 4.000 | 0.00 | 0.0 |
| DeepSeek R1 0528 (May '25) | DeepSeek | — | 27.1 | 3.95 | 1.350 | 5.400 | 0.00 | 0.0 |
| GPT-5 (low) | OpenAI | — | 39.2 | 3.93 | 1.250 | 10.000 | 11.79 | 48.8 |
| Qwen3 235B A22B 2507 (Reasoning) | Alibaba | — | 29.5 | 3.87 | 0.700 | 8.400 | 1.31 | 42.4 |
| GPT-5.3 Codex (xhigh) | OpenAI | — | 54.0 | 3.86 | 1.750 | 14.000 | 56.74 | 67.5 |
| Llama 3.1 Nemotron Instruct 70B | NVIDIA | — | 13.4 | 3.84 | 1.200 | 1.200 | 0.54 | 34.8 |
| o3 | OpenAI | — | 38.4 | 3.78 | 2.000 | 8.000 | 7.97 | 73.7 |
| Qwen3 Max (Preview) | Alibaba | — | 26.1 | 3.74 | 1.200 | 6.000 | 1.68 | 45.1 |
| Gemini 3 Pro Preview (high) | Google | — | 48.4 | 3.70 | 2.000 | 12.000 | 24.38 | 119.4 |
| GPT-5.2 (xhigh) | OpenAI | — | 51.3 | 3.67 | 1.750 | 14.000 | 28.35 | 74.7 |
| Qwen3 VL 235B A22B (Reasoning) | Alibaba | — | 27.6 | 3.62 | 0.700 | 8.400 | 1.10 | 53.8 |
| Nova 2.0 Pro Preview (medium) | Amazon | — | 35.7 | 3.57 | 1.250 | 10.000 | 12.71 | 138.1 |
| Llama 3 Instruct 70B | Meta | — | 8.9 | 3.51 | 0.580 | 1.745 | 0.45 | 46.9 |
| GPT-5.2 Codex (xhigh) | OpenAI | — | 49.0 | 3.50 | 1.750 | 14.000 | 1.54 | 115.3 |
| GPT-5.4 (xhigh) | OpenAI | — | 57.2 | 3.50 | 2.500 | 15.000 | 131.33 | 84.0 |
| Gemini 2.5 Pro | Google | — | 34.6 | 3.46 | 1.250 | 10.000 | 21.05 | 124.1 |
| Grok 4.20 Beta 0309 (Non-reasoning) | xAI | — | 29.7 | 3.40 | 2.000 | 6.000 | 0.36 | 90.6 |
| Command-R (Mar '24) | Cohere | — | 7.4 | 3.39 | 0.500 | 1.500 | 0.00 | 0.0 |
| Magistral Medium 1.2 | Mistral | — | 27.1 | 3.39 | 2.000 | 5.000 | 0.42 | 88.6 |
| GPT-5.2 (medium) | OpenAI | — | 46.6 | 3.33 | 1.750 | 14.000 | 0.00 | 0.0 |
| Nova Pro | Amazon | — | 13.5 | 3.32 | 0.800 | 3.200 | 0.00 | 0.0 |
| Qwen3 VL 32B (Reasoning) | Alibaba | — | 24.7 | 3.23 | 0.700 | 8.400 | 1.17 | 97.2 |
| DeepSeek R1 (Jan '25) | DeepSeek | — | 18.8 | 3.21 | 1.350 | 4.000 | 0.00 | 0.0 |
| Nova 2.0 Pro Preview (low) | Amazon | — | 31.9 | 3.19 | 1.250 | 10.000 | 3.23 | 161.5 |
| Gemini 3 Pro Preview (low) | Google | — | 41.3 | 3.15 | 2.000 | 12.000 | 3.91 | 115.1 |
| Claude Sonnet 4.6 (Adaptive Reasoning, Max Effort) | Anthropic | — | 51.7 | 2.96 | 3.000 | 15.000 | 49.35 | 64.9 |
| Gemini 2.5 Pro Preview (May '25) | Google | — | 29.5 | 2.95 | 1.250 | 10.000 | 0.00 | 0.0 |
| Qwen3 Coder 480B A35B Instruct | Alibaba | — | 24.8 | 2.84 | 1.500 | 7.500 | 1.57 | 68.4 |
| GPT-5.1 (Non-reasoning) | OpenAI | — | 27.4 | 2.73 | 1.250 | 10.000 | 0.64 | 125.5 |
| Qwen3 235B A22B (Reasoning) | Alibaba | — | 19.8 | 2.59 | 0.700 | 8.400 | 1.15 | 62.5 |
| GPT-4.1 | OpenAI | — | 26.3 | 2.58 | 2.000 | 8.000 | 0.53 | 96.4 |
| Claude Sonnet 4.6 (Non-reasoning, High Effort) | Anthropic | — | 44.4 | 2.54 | 3.000 | 15.000 | 1.07 | 47.2 |
| Claude 4.5 Sonnet (Reasoning) | Anthropic | — | 43.0 | 2.46 | 3.000 | 15.000 | 7.67 | 44.7 |
| Claude Sonnet 4.6 (Non-reasoning, Low Effort) | Anthropic | — | 42.6 | 2.43 | 3.000 | 15.000 | 1.16 | 46.2 |
| GPT-5.2 (Non-reasoning) | OpenAI | — | 33.6 | 2.39 | 1.750 | 14.000 | 0.60 | 66.3 |
| GPT-5 (minimal) | OpenAI | — | 23.9 | 2.38 | 1.250 | 10.000 | 0.96 | 54.2 |
| Grok 4 | xAI | — | 41.5 | 2.37 | 3.000 | 15.000 | 9.16 | 43.4 |
| Nova 2.0 Pro Preview (Non-reasoning) | Amazon | — | 23.1 | 2.30 | 1.250 | 10.000 | 0.47 | 166.3 |
| Claude 4 Sonnet (Reasoning) | Anthropic | — | 38.7 | 2.21 | 3.000 | 15.000 | 7.91 | 46.6 |
| GPT-5 (ChatGPT) | OpenAI | — | 21.8 | 2.17 | 1.250 | 10.000 | 0.62 | 114.3 |
| GPT-5.4 (Non-reasoning) | OpenAI | — | 35.4 | 2.15 | 2.500 | 15.000 | 0.67 | 62.6 |
| Qwen3 32B (Reasoning) | Alibaba | — | 16.5 | 2.15 | 0.700 | 8.400 | 1.00 | 100.0 |
| Claude 4.5 Sonnet (Non-reasoning) | Anthropic | — | 37.1 | 2.11 | 3.000 | 15.000 | 0.81 | 41.3 |
| Mistral Small (Feb '24) | Mistral | — | 9.0 | 2.05 | 1.000 | 3.000 | 0.40 | 151.0 |
| Qwen2.5 Max | Alibaba | — | 16.3 | 1.99 | 1.600 | 6.400 | 1.12 | 50.0 |
| Claude 3.7 Sonnet (Reasoning) | Anthropic | — | 34.7 | 1.97 | 3.000 | 15.000 | 0.00 | 0.0 |
| Apertus 70B Instruct | Swiss AI Initiative | — | 7.7 | 1.95 | 0.820 | 2.920 | 1.67 | 63.5 |
| Claude 4 Sonnet (Non-reasoning) | Anthropic | — | 33.0 | 1.88 | 3.000 | 15.000 | 1.11 | 45.3 |
| Claude Opus 4.6 (Adaptive Reasoning, Max Effort) | Anthropic | — | 53.0 | 1.81 | 5.000 | 25.000 | 8.44 | 50.5 |
| Claude 3.7 Sonnet (Non-reasoning) | Anthropic | — | 30.8 | 1.75 | 3.000 | 15.000 | 0.00 | 0.0 |
| Mistral Large 2 (Nov '24) | Mistral | — | 15.1 | 1.71 | 2.000 | 6.000 | 0.45 | 44.6 |
| Claude Opus 4.5 (Reasoning) | Anthropic | — | 49.7 | 1.69 | 5.000 | 25.000 | 10.86 | 56.8 |
| Llama 3.1 Instruct 405B | Meta | — | 17.4 | 1.60 | 2.750 | 6.500 | 0.54 | 38.0 |
| Pixtral Large | Mistral | — | 14.0 | 1.59 | 2.000 | 6.000 | 0.42 | 58.7 |
| Claude Opus 4.6 (Non-reasoning, High Effort) | Anthropic | — | 46.5 | 1.58 | 5.000 | 25.000 | 1.90 | 44.2 |
| Mistral Large 2 (Jul '24) | Mistral | — | 13.0 | 1.47 | 2.000 | 6.000 | 0.00 | 0.0 |
| Claude Opus 4.5 (Non-reasoning) | Anthropic | — | 43.1 | 1.46 | 5.000 | 25.000 | 1.02 | 48.6 |
| GPT-4o (Aug '24) | OpenAI | — | 18.6 | 1.44 | 2.500 | 10.000 | 0.50 | 104.7 |
| Grok 3 | xAI | — | 25.2 | 1.42 | 3.000 | 15.000 | 0.32 | 67.4 |
| GPT-4o (Nov '24) | OpenAI | — | 17.3 | 1.34 | 2.500 | 10.000 | 0.48 | 163.9 |
| Nova Premier | Amazon | — | 19.0 | 1.29 | 2.500 | 12.500 | 0.87 | 61.8 |
| Jamba 1.7 Large | AI21 Labs | — | 10.9 | 1.05 | 2.000 | 8.000 | 0.77 | 54.6 |
| Command A | Cohere | — | 13.5 | 1.04 | 2.500 | 10.000 | 0.40 | 44.6 |
| Jamba 1.5 Large | AI21 Labs | — | 10.7 | 1.03 | 2.000 | 8.000 | 0.00 | 0.0 |
| Jamba 1.6 Large | AI21 Labs | — | 10.6 | 1.02 | 2.000 | 8.000 | 0.78 | 54.5 |
| Claude 3.5 Sonnet (Oct '24) | Anthropic | — | 15.9 | 0.89 | 3.000 | 15.000 | 0.00 | 0.0 |
| Sonar Pro | Perplexity | — | 15.2 | 0.85 | 3.000 | 15.000 | 1.21 | 114.9 |
| Claude 3.5 Sonnet (June '24) | Anthropic | — | 14.2 | 0.79 | 3.000 | 15.000 | 0.00 | 0.0 |
| Mistral Medium | Mistral | — | 9.0 | 0.73 | 2.750 | 8.100 | 0.38 | 95.2 |
| GPT-4o (May '24) | OpenAI | — | 14.5 | 0.64 | 5.000 | 15.000 | 0.57 | 99.6 |
| Claude 3 Sonnet | Anthropic | — | 10.3 | 0.56 | 3.000 | 15.000 | 0.00 | 0.0 |
| Mistral Large (Feb '24) | Mistral | — | 9.9 | 0.54 | 4.000 | 12.000 | 0.00 | 0.0 |
| Claude 4.1 Opus (Reasoning) | Anthropic | — | 42.0 | 0.45 | 15.000 | 75.000 | 9.43 | 35.1 |
| Command-R+ (Apr '24) | Cohere | — | 8.3 | 0.45 | 3.000 | 15.000 | 0.00 | 0.0 |
| Claude 4 Opus (Reasoning) | Anthropic | — | 39.0 | 0.42 | 15.000 | 75.000 | 7.38 | 34.5 |
| Claude 4.1 Opus (Non-reasoning) | Anthropic | — | 36.0 | 0.38 | 15.000 | 75.000 | 1.57 | 31.6 |
| o1 | OpenAI | — | 30.8 | 0.37 | 15.000 | 60.000 | 17.76 | 119.3 |
| o3-pro | OpenAI | — | 40.7 | 0.37 | 20.000 | 80.000 | 125.10 | 15.0 |
| Claude 4 Opus (Non-reasoning) | Anthropic | — | 33.0 | 0.35 | 15.000 | 75.000 | 1.39 | 31.7 |
| GPT-4 Turbo | OpenAI | — | 13.7 | 0.28 | 10.000 | 30.000 | 1.14 | 35.4 |
| o1-preview | OpenAI | — | 23.7 | 0.25 | 16.500 | 66.000 | 0.00 | 0.0 |
| Claude 3 Opus | Anthropic | — | 18.0 | 0.17 | 15.000 | 75.000 | 0.00 | 0.0 |
| GPT-4 | OpenAI | — | 12.8 | 0.08 | 30.000 | 60.000 | 0.73 | 40.9 |
| o1-pro | OpenAI | — | 25.8 | 0.00 | 150.000 | 600.000 | 0.00 | 0.0 |
| Grok-1 | xAI | — | 11.7 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| GPT-5.4 Pro (xhigh) | OpenAI | — | — | — | 30.000 | 180.000 | 0.00 | 0.0 |
| Gemma 3 27B Instruct | Google | — | 10.3 | — | 0.000 | 0.000 | 0.77 | 33.6 |
| Gemma 3 12B Instruct | Google | — | 8.8 | — | 0.000 | 0.000 | 29.82 | 34.3 |
| Gemma 3 4B Instruct | Google | — | 6.3 | — | 0.000 | 0.000 | 1.06 | 35.4 |
| Gemma 3 1B Instruct | Google | — | 5.5 | — | 0.000 | 0.000 | 0.55 | 52.4 |
| Gemma 3 270M | Google | — | 7.7 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Gemma 3n E2B Instruct | Google | — | 4.8 | — | 0.000 | 0.000 | 0.38 | 52.3 |
| Devstral 2 | Mistral | — | 22.0 | — | 0.000 | 0.000 | 0.39 | 79.4 |
| Devstral Small 2 | Mistral | — | 19.5 | — | 0.000 | 0.000 | 0.34 | 205.3 |
| DeepSeek V3.2 Speciale | DeepSeek | — | 29.4 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| DeepSeek R1 0528 Qwen3 8B | DeepSeek | — | 16.4 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| R1 1776 | Perplexity | — | 12.0 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Falcon-H1R-7B | TII UAE | — | 15.8 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Grok Voice Agent | xAI | — | — | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Phi-4 Multimodal Instruct | Microsoft Azure | — | 10.0 | — | 0.000 | 0.000 | 0.34 | 16.7 |
| Phi-4 Mini Instruct | Microsoft Azure | — | 8.4 | — | 0.000 | 0.000 | 0.31 | 43.9 |
| LFM2 2.6B | Liquid AI | — | 8.0 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| LFM2.5-1.2B-Instruct | Liquid AI | — | 8.0 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| LFM2.5-VL-1.6B | Liquid AI | — | 6.2 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| LFM2.5-1.2B-Thinking | Liquid AI | — | 8.1 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| LFM2 8B A1B | Liquid AI | — | 7.0 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Solar Pro 2 (Reasoning) | Upstage | — | 14.9 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Solar Open 100B (Reasoning) | Upstage | — | 21.7 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Solar Pro 2 (Non-reasoning) | Upstage | — | 13.6 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Llama 3.1 Nemotron Nano 4B v1.1 (Reasoning) | NVIDIA | — | 14.4 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Llama 3.3 Nemotron Super 49B v1 (Reasoning) | NVIDIA | — | 18.5 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Llama 3.3 Nemotron Super 49B v1 (Non-reasoning) | NVIDIA | — | 14.3 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Kimi Linear 48B A3B Instruct | Kimi | — | 14.4 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Step3 VL 10B | StepFun | — | 15.4 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Molmo 7B-D | Allen Institute for AI | — | 9.2 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Olmo 3 7B Think | Allen Institute for AI | — | 9.4 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Olmo 3.1 32B Think | Allen Institute for AI | — | 13.9 | — | 0.000 | 0.000 | 0.65 | 99.0 |
| Molmo2-8B | Allen Institute for AI | — | 7.3 | — | 0.000 | 0.000 | 0.40 | 139.2 |
| Granite 4.0 1B | IBM | — | 7.3 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Granite 4.0 350M | IBM | — | 6.1 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Granite 4.0 H 1B | IBM | — | 8.0 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Granite 4.0 Micro | IBM | — | 7.7 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Granite 4.0 H 350M | IBM | — | 5.4 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| DeepHermes 3 - Llama-3.1 8B Preview (Non-reasoning) | Nous Research | — | 7.6 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| DeepHermes 3 - Mistral 24B Preview (Non-reasoning) | Nous Research | — | 10.9 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Exaone 4.0 1.2B (Reasoning) | LG AI Research | — | 8.3 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| K-EXAONE (Reasoning) | LG AI Research | — | 32.1 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Exaone 4.0 1.2B (Non-reasoning) | LG AI Research | — | 8.1 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| EXAONE 4.0 32B (Reasoning) | LG AI Research | — | 16.7 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| K-EXAONE (Non-reasoning) | LG AI Research | — | 23.4 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| EXAONE 4.0 32B (Non-reasoning) | LG AI Research | — | 11.7 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| MiMo-V2-Pro | Xiaomi | — | 49.2 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Llama 65B | Meta | — | 7.4 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| mimo-v2-omni | Xiaomi | — | 43.4 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| ERNIE 5.0 Thinking Preview | Baidu | — | 29.1 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Sarvam 30B (Reasoning) | Sarvam | — | 12.3 | — | 0.000 | 0.000 | 1.56 | 306.5 |
| Sarvam 105B (Reasoning) | Sarvam | — | 18.2 | — | 0.000 | 0.000 | 1.84 | 105.9 |
| Cogito v2.1 (Reasoning) | Deep Cogito | — | — | — | 1.250 | 1.250 | 0.41 | 93.7 |
| INTELLECT-3 | Prime Intellect | — | 22.2 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Motif-2-12.7B-Reasoning | Motif Technologies | — | 19.1 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| K2-V2 (medium) | MBZUAI Insti...ation Models | — | 18.7 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| K2 Think V2 | MBZUAI Insti...ation Models | — | 24.1 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| K2-V2 (low) | MBZUAI Insti...ation Models | — | 14.4 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| K2-V2 (high) | MBZUAI Insti...ation Models | — | 20.6 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Mi:dm K 2.5 Pro Preview | Korea Telecom | — | — | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Mi:dm K 2.5 Pro | Korea Telecom | — | 23.1 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| HyperCLOVA X SEED Think (32B) | Naver | — | 23.7 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| LongCat Flash Lite | LongCat | — | 23.9 | — | 0.000 | 0.000 | 3.65 | 127.9 |
| Tri-21B-think Preview | Trillion Labs | — | 20.0 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Tri-21B-Think | Trillion Labs | — | 18.6 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Nanbeige4.1-3B | Nanbeige | — | 16.1 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Qwen Chat 14B | Alibaba | — | 7.4 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Tiny Aya Global | Cohere | — | 4.7 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Apriel-v1.6-15B-Thinker | ServiceNow | — | 27.6 | — | 0.000 | 0.000 | 0.22 | 131.4 |
| Jamba 1.7 Mini | AI21 Labs | — | 8.1 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Jamba Reasoning 3B | AI21 Labs | — | 9.6 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Qwen3.5 4B (Non-reasoning) | Alibaba | — | 22.6 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Qwen3.5 0.8B (Non-reasoning) | Alibaba | — | 9.9 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Qwen3.5 2B (Non-reasoning) | Alibaba | — | 14.7 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Qwen3.5 4B (Reasoning) | Alibaba | — | 27.1 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Qwen3.5 2B (Reasoning) | Alibaba | — | 16.3 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Qwen3.5 9B (Non-reasoning) | Alibaba | — | 27.3 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Qwen3.5 0.8B (Reasoning) | Alibaba | — | 10.5 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Ling-mini-2.0 | InclusionAI | — | 9.2 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Ring-1T | InclusionAI | — | 22.8 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Ling-1T | InclusionAI | — | 19.0 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Doubao Seed Code | ByteDance Seed | — | 33.5 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| o1-mini | OpenAI | — | 20.4 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| GPT-4o Realtime (Dec '24) | OpenAI | — | — | — | 0.000 | 0.000 | 0.00 | 0.0 |
| GPT-4o mini Realtime (Dec '24) | OpenAI | — | — | — | 0.000 | 0.000 | 0.00 | 0.0 |
| GPT-4.5 (Preview) | OpenAI | — | 20.0 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| GPT-3.5 Turbo (0613) | OpenAI | — | — | — | 0.000 | 0.000 | 0.00 | 0.0 |
| GPT-4o (March 2025, chatgpt-4o-latest) | OpenAI | — | 18.6 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| GPT-4o (ChatGPT) | OpenAI | — | 14.1 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Llama 2 Chat 13B | Meta | — | 8.4 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Llama 2 Chat 70B | Meta | — | 8.4 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Gemini 2.0 Pro Experimental (Feb '25) | Google | — | 18.1 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Gemini 2.0 Flash (experimental) | Google | — | 16.8 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Gemini 1.5 Pro (Sep '24) | Google | — | 16.0 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Gemini 2.0 Flash-Lite (Preview) | Google | — | 14.5 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Gemini 1.5 Flash (Sep '24) | Google | — | 13.8 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Gemini 1.5 Flash-8B | Google | — | 11.1 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Gemini 2.0 Flash Thinking Experimental (Jan '25) | Google | — | 19.6 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Gemini 1.5 Pro (May '24) | Google | — | 12.0 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Gemini 2.5 Flash Preview (Non-reasoning) | Google | — | 17.8 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| PaLM 2 | Google | — | 8.6 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Gemini 2.5 Flash Preview (Sep '25) (Reasoning) | Google | — | 31.1 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Gemini 1.0 Pro | Google | — | 8.5 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Gemini 2.0 Flash Thinking Experimental (Dec '24) | Google | — | 12.3 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Gemini 2.0 Flash-Lite (Feb '25) | Google | — | 14.7 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Gemini 2.5 Pro Preview (Mar '25) | Google | — | 30.3 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Gemini 1.0 Ultra | Google | — | 10.1 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Gemini 1.5 Flash (May '24) | Google | — | 10.5 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Gemma 3n E4B Instruct Preview (May '25) | Google | — | 10.1 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Gemini 2.5 Flash Preview (Reasoning) | Google | — | 24.3 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Gemini 2.5 Flash Preview (Sep '25) (Non-reasoning) | Google | — | 25.7 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Claude Instant | Anthropic | — | 7.4 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Claude 2.0 | Anthropic | — | 9.1 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Claude 2.1 | Anthropic | — | 9.3 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Mixtral 8x22B Instruct | Mistral | — | 9.8 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Mistral Saba | Mistral | — | 12.1 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Magistral Small 1 | Mistral | — | 16.8 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Magistral Medium 1 | Mistral | — | 18.8 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| DeepSeek R1 Distill Qwen 14B | DeepSeek | — | 15.8 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| DeepSeek-V2.5 (Dec '24) | DeepSeek | — | 12.5 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| DeepSeek-Coder-V2 | DeepSeek | — | 10.6 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| DeepSeek R1 Distill Llama 8B | DeepSeek | — | 12.1 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| DeepSeek LLM 67B Chat (V1) | DeepSeek | — | 8.4 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| DeepSeek R1 Distill Qwen 1.5B | DeepSeek | — | 9.1 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| DeepSeek Coder V2 Lite Instruct | DeepSeek | — | 8.5 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| DeepSeek-V2.5 | DeepSeek | — | 12.3 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| DeepSeek-V2-Chat | DeepSeek | — | 9.1 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Sonar Reasoning Pro | Perplexity | — | 24.6 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Sonar Reasoning | Perplexity | — | 17.9 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Grok Beta | xAI | — | 13.3 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Grok 2 (Dec '24) | xAI | — | 13.9 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Grok 3 Reasoning Beta | xAI | — | 21.6 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| OpenChat 3.5 (1210) | OpenChat | — | 8.3 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Phi-3 Mini Instruct 3.8B | Microsoft Azure | — | 10.1 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| LFM 40B | Liquid AI | — | 8.8 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| LFM2 1.2B | Liquid AI | — | 6.3 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Solar Pro 2 (Preview) (Reasoning) | Upstage | — | 18.8 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Solar Pro 2 (Preview) (Non-reasoning) | Upstage | — | 16.0 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| DBRX Instruct | Databricks | — | 8.3 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| MiniMax M1 40k | MiniMax | — | 20.9 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Llama 3.1 Tulu3 405B | Allen Institute for AI | — | 14.1 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Olmo 3 32B Think | Allen Institute for AI | — | 12.1 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| OLMo 2 32B | Allen Institute for AI | — | 10.6 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| OLMo 2 7B | Allen Institute for AI | — | 9.3 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Sarvam M (Reasoning) | Sarvam | — | 8.4 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Apriel-v1.5-15B-Thinker | ServiceNow | — | 28.3 | — | 0.000 | 0.000 | 0.18 | 140.0 |
| Arctic Instruct | Snowflake | — | 8.8 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Qwen2.5 Instruct 72B | Alibaba | — | 15.6 | — | 0.000 | 0.000 | 1.07 | 55.3 |
| Qwen2.5 Coder Instruct 32B | Alibaba | — | 12.9 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Qwen2 Instruct 72B | Alibaba | — | 11.7 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Qwen2.5 Coder Instruct 7B | Alibaba | — | 10.0 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Qwen3 4B 2507 Instruct | Alibaba | — | 12.9 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Qwen Chat 72B | Alibaba | — | 8.8 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Qwen2.5 Instruct 32B | Alibaba | — | 13.2 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Qwen3 4B 2507 (Reasoning) | Alibaba | — | 18.2 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Qwen1.5 Chat 110B | Alibaba | — | 9.5 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Qwen3 VL 4B Instruct | Alibaba | — | 9.6 | — | 0.000 | 0.000 | 0.00 | 0.0 |
| Qwen3 VL 4B (Reasoning) | Alibaba | — | 13.7 | — | 0.000 | 0.000 | 0.00 | 0.0 |
Value score (0–100) = normalize(intelligence_index / (3 × input_price + output_price)); rows with missing or invalid prices are excluded from the calculation.
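The formula above can be sketched in a few lines of Python. This is a minimal illustration, not the index's official code: the function name and input tuples are hypothetical, and the normalization step is assumed to scale the best intelligence-per-dollar ratio to 100 (which is consistent with the top-ranked row scoring 100.00).

```python
def value_scores(models):
    """models: list of (name, intelligence_index, input_price, output_price),
    prices in $/1M tokens. Returns {name: value score} scaled so the best
    intelligence-per-dollar ratio maps to 100 (assumed normalization)."""
    ratios = {}
    for name, intel, inp, out in models:
        # Rows with missing or invalid prices are excluded, per the footnote.
        if inp is None or out is None or (3 * inp + out) <= 0:
            continue
        # Blended cost weights input price 3:1 over output price.
        ratios[name] = intel / (3 * inp + out)
    if not ratios:
        return {}
    best = max(ratios.values())
    return {name: round(100 * r / best, 2) for name, r in ratios.items()}

# Example using two priced rows from the table and one unpriced row:
sample = [
    ("Qwen3.5 9B (Reasoning)", 32.4, 0.100, 0.150),   # 32.4 / 0.45 = 72.0
    ("MiMo-V2-Flash (Feb 2026)", 41.5, 0.100, 0.300), # 41.5 / 0.60 ≈ 69.17
    ("Mi:dm K 2.5 Pro", 23.1, None, None),            # no price data: skipped
]
scores = value_scores(sample)
```

With this sample, the top ratio (72.0) normalizes to 100.00 and the second to 96.06, matching the published Value column for those two models.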