Measured performance

Provider & model performance

Measured time-to-first-token, time-to-first-byte, throughput, and uptime for every LLM provider and model TrustedRouter routes to — continuously sampled, not vendor-claimed.

Last updated 2026-06-15T21:24:01Z
Continuously sampled from TrustedRouter's monitor regions over the 5,000-sample benchmark set. Time-to-first-token (TTFT), time-to-first-byte (TTFB), throughput, and success rate are measured on real streaming requests, not taken from vendor claims. Rows for unsupported routes and probe-configuration errors are reported separately and do not count as provider downtime. No prompt or output content is ever stored.
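As a rough illustration of what these metrics mean for a single streaming request, the sketch below derives TTFB, TTFT, and throughput from timestamps. It is not TrustedRouter's probe code: the `time_stream` function, the `StreamTiming` type, the `(timestamp, token_count)` chunk format, and the choice to define throughput as tokens per second between the first and last token are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class StreamTiming:
    ttfb_s: Optional[float]        # seconds until the first response byte
    ttft_s: Optional[float]        # seconds until the first generated token
    tokens_per_s: Optional[float]  # tokens per second while generating


def time_stream(
    start: float,
    first_byte: Optional[float],
    chunks: List[Tuple[float, int]],
) -> StreamTiming:
    """Summarize one streamed response (hypothetical event format).

    start and first_byte are timestamps in seconds from the same clock.
    chunks holds (timestamp, token_count) pairs in arrival order.
    """
    ttfb = None if first_byte is None else first_byte - start

    # Ignore keep-alive or metadata chunks that carry no tokens.
    token_chunks = [(t, n) for t, n in chunks if n > 0]
    if not token_chunks:
        return StreamTiming(ttfb, None, None)

    first_token_at = token_chunks[0][0]
    last_token_at = token_chunks[-1][0]
    total_tokens = sum(n for _, n in token_chunks)

    # Assumed definition: all tokens divided by the generation window.
    window = last_token_at - first_token_at
    throughput = total_tokens / window if window > 0 else None

    return StreamTiming(ttfb, first_token_at - start, throughput)


# Made-up timestamps, for illustration only.
timing = time_stream(
    start=0.0,
    first_byte=0.25,
    chunks=[(0.5, 1), (1.5, 20), (2.5, 20)],
)
print(timing)
```

A request that fails before any token arrives yields `None` for TTFT and throughput, which keeps failed probes from being averaged into the latency figures.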

Providers

Ranked by measured p50 time-to-first-token across all of a provider's models in the 5,000-sample benchmark set (22 providers · 2,876 samples).

#   Provider     Models  p50 TTFT  Throughput  Uptime  Errors    Config excluded  Samples
1   mistral      8       -         -           0.00%   http_402  100%             138
2   together     3       -         -           0.00%   http_402  100%             132
3   phala        18      -         -           0.00%   http_402  100%             130
4   parasail     29      -         -           0.00%   http_402  100%             125
5   tinfoil      5       -         -           0.00%   http_402  100%             129
6   siliconflow  7       -         -           0.00%   http_402  100%             127
7   minimax      6       -         -           0.00%   http_402  100%             121
8   venice       11      -         -           0.00%   http_402  100%             114
9   cerebras     4       -         -           0.00%   http_402  100%             142
10  deepinfra    7       -         -           0.00%   http_402  100%             114
11  gmi          5       -         -           0.00%   http_402  100%             131
12  xiaomi       5       -         -           0.00%   http_402  100%             129
13  openai       11      -         -           0.00%   http_402  100%             147
14  grok         2       -         -           0.00%   http_402  100%             133
15  zai          12      -         -           0.00%   http_402  100%             136
16  deepseek     2       -         -           0.00%   http_402  100%             122
17  gemini       8       -         -           0.00%   http_402  100%             138
18  nebius       18      -         -           0.00%   http_402  100%             157
19  kimi         3       -         -           0.00%   http_402  100%             134
20  novita       63      -         -           0.00%   http_402  100%             119
21  anthropic    10      -         -           0.00%   http_402  100%             142
22  lightning    1       -         -           0.00%   http_402  100%             116
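One way such a ranking could be produced is sketched below: pool each provider's TTFT samples, take the median, and sort ascending, with providers that have no successful measurement placed last. The `rank_providers` function, the tie-break by name, and the provider names and timings in the example are hypothetical and are not taken from TrustedRouter.

```python
from statistics import median
from typing import Dict, List, Optional, Tuple


def p50(values: List[float]) -> Optional[float]:
    """Median of the samples, or None when nothing was measured."""
    return median(values) if values else None


def rank_providers(ttft_by_provider: Dict[str, List[float]]) -> List[str]:
    """Order providers by p50 TTFT, fastest first.

    Providers with no TTFT samples sort after all measured ones.
    Ties are broken by name (an assumption made for this sketch).
    """
    def sort_key(item: Tuple[str, List[float]]) -> Tuple[bool, float, str]:
        name, values = item
        m = p50(values)
        return (m is None, m if m is not None else 0.0, name)

    return [name for name, _ in sorted(ttft_by_provider.items(), key=sort_key)]


# Hypothetical providers and timings in seconds, for illustration only.
ranking = rank_providers({
    "alpha": [0.4, 0.6, 0.5],
    "beta": [],
    "gamma": [0.2, 0.3],
})
print(ranking)
```

Sorting on a `(missing, value, name)` tuple avoids comparing `None` with floats and keeps the order stable when no provider has a measured TTFT.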

Models

Models sampled in the 5,000-sample benchmark set, fastest measured TTFT first. Rows with few samples are marked "limited data"; more data sharpens the numbers.

#  Model  Provider  p50 TTFT  p95 TTFT  p50 TTFB  Throughput  Uptime  Config excluded  Samples
1 mistralai/ministral-14b-2512 limited data mistral 0.00% 17
2 moonshotai/kimi-k2.6 together 0.00% 54
3 moonshotai/kimi-k2.5 limited data phala 0.00% 5
4 qwen/qwen3-vl-8b-instruct limited data parasail 0.00% 7
5 qwen/qwen3.5-35b-a3b limited data parasail 0.00% 5
6 meta-llama/llama-3.3-70b-instruct limited data tinfoil 0.00% 19
7 minimax/minimax-m3 limited data siliconflow 0.00% 18
8 minimax/minimax-m2 minimax 0.00% 24
9 z-ai/glm-5-turbo limited data venice 0.00% 10
10 cerebras/gpt-oss-120b cerebras 0.00% 37
11 google/gemma-4-31b-it limited data deepinfra 0.00% 12
12 minimax/minimax-m2.7 limited data minimax 0.00% 17
13 deepseek/deepseek-v3.2 limited data parasail 0.00% 4
14 z-ai/glm-5.1 gmi 0.00% 21
15 xiaomi/mimo-v2-pro xiaomi 0.00% 22
16 google/gemma-3-4b-it deepinfra 0.00% 20
17 google/gemma-4-26b-a4b-it gmi 0.00% 25
18 meta-llama/llama-3.1-70b-instruct limited data deepinfra 0.00% 12
19 openai/gpt-4o limited data openai 0.00% 18
20 mistralai/mistral-medium-3-5 mistral 0.00% 21
21 openai/gpt-5.5 limited data openai 0.00% 16
22 x-ai/grok-4.20 grok 0.00% 75
23 openai/gpt-oss-20b limited data parasail 0.00% 3
24 z-ai/glm-4.6 limited data zai 0.00% 12
25 xiaomi/mimo-v2-flash xiaomi 0.00% 30
26 openai/gpt-oss-120b cerebras 0.00% 33
27 deepseek/deepseek-v4-pro deepseek 0.00% 61
28 tencent/hy3-preview limited data siliconflow 0.00% 16
29 z-ai/glm-4.5-air:free limited data zai 0.00% 14
30 moonshotai/kimi-k2.6 tinfoil 0.00% 32
31 z-ai/glm-5.1 limited data zai 0.00% 11
32 google/gemini-2.5-flash gemini 0.00% 20
33 Qwen/Qwen3.5-397B-A17B limited data nebius 0.00% 6
34 minimax/minimax-m2.7-highspeed minimax 0.00% 22
35 moonshotai/kimi-k2.5 kimi 0.00% 54
36 z-ai/glm-5 limited data venice 0.00% 10
37 cerebras/zai-glm-4.7 cerebras 0.00% 36
38 qwen/qwen3-coder-480b-a35b-instruct limited data novita 0.00% 4
39 minimax/minimax-m2.5 limited data novita 0.00% 4
40 anthropic/claude-opus-4.1 limited data anthropic 0.00% 16
41 google/gemma-4-31b-it lightning 0.00% 116
42 google/gemini-3.1-pro-preview gemini 0.00% 23
43 inclusionai/ling-2.6-1t limited data novita 0.00% 2
44 z-ai/glm-4.6 limited data venice 0.00% 11
45 tencent/hunyuan-a13b-instruct siliconflow 0.00% 20
46 google/gemma-3-12b-it deepinfra 0.00% 20
47 deepseek/deepseek-v4-flash limited data siliconflow 0.00% 18
48 z-ai/glm-5.1 tinfoil 0.00% 29
49 qwen/qwen3.5-27b limited data deepinfra 0.00% 17
50 Qwen/Qwen3-Next-80B-A3B-Thinking limited data nebius 0.00% 10
51 deepseek/deepseek-v4-flash deepseek 0.00% 61
52 qwen/qwen-2.5-7b-instruct together 0.00% 41
53 anthropic/claude-opus-4.7 limited data anthropic 0.00% 10
54 deepseek/deepseek-chat-v3.1 limited data phala 0.00% 6
55 z-ai/glm-4.7 cerebras 0.00% 36
56 z-ai/glm-4.7 limited data zai 0.00% 10
57 qwen/qwen3.6-27b limited data venice 0.00% 10
58 z-ai/glm-4.7 limited data parasail 0.00% 3
59 bytedance/ui-tars-1.5-7b limited data parasail 0.00% 5
60 z-ai/glm-5v-turbo limited data zai 0.00% 14
61 openai/gpt-oss-120b tinfoil 0.00% 30
62 openai/o4-mini limited data openai 0.00% 15
63 openai/gpt-5.4-mini limited data openai 0.00% 12
64 deepseek/deepseek-v4-pro limited data tinfoil 0.00% 19
65 anthropic/claude-haiku-4.5 limited data anthropic 0.00% 14
66 deepseek/deepseek-v4-pro siliconflow 0.00% 24
67 meta-llama/llama-3.3-70b-instruct together 0.00% 37
68 openai/gpt-4.1-nano limited data openai 0.00% 13
69 qwen/qwen3.5-27b limited data phala 0.00% 8
70 qwen/qwen3-vl-8b-instruct limited data novita 0.00% 4
71 z-ai/glm-5 limited data phala 0.00% 12
72 z-ai/glm-5.1 limited data venice 0.00% 11
73 mistralai/mistral-small-2603 limited data mistral 0.00% 17
74 qwen/qwen3.6-35b-a3b limited data parasail 0.00% 3
75 google/gemini-2.5-pro limited data gemini 0.00% 12
76 deepseek/deepseek-v4-pro gmi 0.00% 20
77 z-ai/glm-5-turbo limited data zai 0.00% 13
78 openai/o3 limited data openai 0.00% 17
79 x-ai/grok-4.3 grok 0.00% 58
80 google/gemma-3-27b-it limited data parasail 0.00% 9
81 deepseek/deepseek-prover-v2-671b limited data novita 0.00% 3
82 z-ai/glm-5.1 limited data phala 0.00% 10
83 openai/gpt-4o-mini limited data openai 0.00% 17
84 mistralai/mistral-nemo limited data mistral 0.00% 18
85 MiniMaxAI/MiniMax-M2.5 limited data nebius 0.00% 8
86 qwen/qwen3-next-80b-a3b-instruct limited data parasail 0.00% 7
87 z-ai/glm-5.2 limited data zai 0.00% 9
88 deepseek/deepseek-v3.1-terminus limited data novita 0.00% 1
89 anthropic/claude-opus-4.6 limited data anthropic 0.00% 16
90 meta-llama/llama-3.3-70b-instruct limited data parasail 0.00% 6
91 openai/o3-mini limited data openai 0.00% 12
92 google/gemma-3-27b-it limited data phala 0.00% 7
93 google/gemini-3-flash-preview limited data gemini 0.00% 18
94 moonshotai/kimi-k2.6 kimi 0.00% 42
95 anthropic/claude-sonnet-4.6 limited data anthropic 0.00% 15
96 z-ai/glm-4.6v limited data zai 0.00% 9
97 minimax/minimax-m2.5-highspeed limited data novita 0.00% 2
98 xiaomi/mimo-v2.5-pro xiaomi 0.00% 28
99 google/gemma-4-31b-it gmi 0.00% 32
100 NousResearch/Hermes-4-405B limited data nebius 0.00% 9
101 qwen/qwen3-235b-a22b-thinking-2507 limited data venice 0.00% 13
102 openai/o1 limited data openai 0.00% 10
103 xiaomi/mimo-v2.5 xiaomi 0.00% 22
104 minimax/minimax-m2.5-highspeed limited data minimax 0.00% 14
105 thedrummer/cydonia-24b-v4.1 limited data parasail 0.00% 10
106 deepseek/deepseek-v3.2 limited data phala 0.00% 5
107 z-ai/glm-5 limited data siliconflow 0.00% 14
108 google/gemma-3-27b-it limited data deepinfra 0.00% 16
109 z-ai/glm-5 gmi 0.00% 33
110 qwen/qwen3.5-35b-a3b limited data novita 0.00% 1
111 nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B limited data nebius 0.00% 14
112 nvidia/nemotron-3-super-120b-a12b limited data nebius 0.00% 6
113 anthropic/claude-opus-4 limited data anthropic 0.00% 14
114 google/gemini-3.1-flash-lite-preview limited data gemini 0.00% 11
115 google/gemini-3.5-flash limited data gemini 0.00% 19
116 Qwen/Qwen3-32B limited data nebius 0.00% 9
117 qwen/qwen3-coder-next limited data parasail 0.00% 7
118 anthropic/claude-opus-4.5 anthropic 0.00% 20
119 moonshotai/kimi-k2.7-code kimi 0.00% 38
120 qwen/qwen3.5-397b-a17b limited data parasail 0.00% 5
121 openai/gpt-4.1-mini limited data openai 0.00% 12
122 z-ai/glm-5 limited data zai 0.00% 11
123 mistralai/ministral-3b-2512 limited data mistral 0.00% 15
124 NousResearch/Hermes-4-70B limited data nebius 0.00% 10
125 anthropic/claude-sonnet-4.5 limited data anthropic 0.00% 12
126 minimax/minimax-m2.5 limited data phala 0.00% 5
127 deepseek-ai/DeepSeek-V4-Pro limited data nebius 0.00% 11
128 google/gemma-3-27b-it limited data novita 0.00% 3
129 qwen/qwen3-omni-30b-a3b-thinking limited data novita 0.00% 1
130 mistralai/mistral-small-3.2-24b-instruct limited data mistral 0.00% 16
131 qwen/qwen3-vl-30b-a3b-instruct limited data phala 0.00% 8
132 z-ai/glm-5v-turbo limited data siliconflow 0.00% 17
133 qwen/qwen2.5-vl-72b-instruct limited data phala 0.00% 7
134 xiaomi/mimo-v2.5-pro-ultraspeed xiaomi 0.00% 27
135 mistralai/ministral-8b-2512 mistral 0.00% 24
136 minimax/minimax-m3 minimax 0.00% 29
137 moonshotai/kimi-k2-instruct limited data novita 0.00% 4
138 google/gemma-4-26b-a4b-it limited data deepinfra 0.00% 17
139 z-ai/glm-4.7-flash limited data venice 0.00% 12
140 google/gemini-2.5-flash-lite limited data gemini 0.00% 16
141 google/gemini-3.1-flash-lite limited data gemini 0.00% 19
142 minimax/minimax-m2.1 limited data novita 0.00% 4
143 qwen/qwen3.5-397b-a17b limited data venice 0.00% 14
144 baidu/ernie-4.5-vl-424b-a47b limited data novita 0.00% 4
145 qwen/qwen3.5-9b limited data venice 0.00% 10
146 openai/gpt-oss-20b limited data phala 0.00% 5
147 z-ai/glm-4.5 limited data zai 0.00% 15
148 mistralai/mistral-large limited data mistral 0.00% 10
149 Qwen/Qwen3-30B-A3B-Instruct-2507 limited data nebius 0.00% 10
150 nvidia/Llama-3_1-Nemotron-Ultra-253B-v1 limited data nebius 0.00% 9
151 qwen/qwen-2.5-7b-instruct limited data phala 0.00% 7
152 openai/gpt-oss-120b limited data nebius 0.00% 8
153 nvidia/Nemotron-3-Nano-Omni limited data nebius 0.00% 10
154 minimax/minimax-m2.1-highspeed limited data minimax 0.00% 15
155 anthropic/claude-opus-4.8 limited data anthropic 0.00% 12
156 qwen/qwen3-omni-30b-a3b-instruct limited data novita 0.00% 1
157 Qwen/Qwen2.5-VL-72B-Instruct limited data nebius 0.00% 10
158 qwen/qwen3-30b-a3b-instruct-2507 limited data phala 0.00% 7
159 z-ai/glm-4.7 limited data venice 0.00% 7
160 google/gemma-3-27b-it limited data nebius 0.00% 8
161 qwen/qwen3-next-80b-a3b-instruct limited data novita 0.00% 2
162 z-ai/glm-4.7-flash limited data phala 0.00% 9
163 openai/gpt-oss-120b limited data novita 0.00% 3
164 z-ai/glm-4.7 limited data phala 0.00% 10
165 openai/gpt-oss-120b limited data parasail 0.00% 1
166 moonshotai/kimi-k2-0905 limited data novita 0.00% 1
167 google/gemma-4-31b-it limited data parasail 0.00% 3
168 moonshotai/kimi-k2-thinking limited data novita 0.00% 1
169 mistralai/mistral-small-3.2-24b-instruct limited data parasail 0.00% 9
170 Sao10K/L3-8B-Stheno-v3.2 limited data novita 0.00% 1
171 minimax/minimax-m2.5 limited data parasail 0.00% 2
172 moonshotai/kimi-k2.6 limited data phala 0.00% 6
173 zai-org/glm-4.6 limited data novita 0.00% 2
174 moonshotai/kimi-k2.6 limited data novita 0.00% 1
175 z-ai/glm-5.1 limited data parasail 0.00% 3
176 Qwen/Qwen3-235B-A22B-Instruct-2507 limited data nebius 0.00% 8
177 google/gemma-4-31b-it limited data novita 0.00% 2
178 anthropic/claude-sonnet-4 limited data anthropic 0.00% 13
179 baidu/ernie-4.5-21B-a3b limited data novita 0.00% 2
180 zai-org/GLM-5.1 limited data nebius 0.00% 6
181 stepfun/step-3.5-flash limited data parasail 0.00% 5
182 google/gemma-4-26b-a4b-it limited data parasail 0.00% 2
183 deepseek/deepseek-r1-0528 limited data novita 0.00% 2
184 meta-llama/llama-3.1-8b-instruct limited data novita 0.00% 2
185 openai/gpt-4.1 limited data openai 0.00% 5
186 z-ai/glm-4.5-air limited data zai 0.00% 7
187 zai-org/glm-4.5-air limited data novita 0.00% 1
188 qwen/qwen-mt-plus limited data novita 0.00% 1
189 qwen/qwen3.5-397b-a17b limited data phala 0.00% 4
190 deepseek/deepseek-v4-flash limited data parasail 0.00% 5
191 thedrummer/skyfall-36b-v2 limited data parasail 0.00% 3
192 openai/gpt-oss-120b limited data phala 0.00% 9
193 z-ai/glm-4.5v limited data zai 0.00% 11
194 qwen/qwen3-vl-235b-a22b-instruct limited data novita 0.00% 3
195 qwen/qwen3-vl-235b-a22b-instruct limited data parasail 0.00% 2
196 z-ai/glm-5v-turbo limited data venice 0.00% 6
197 moonshotai/kimi-k2.6 limited data parasail 0.00% 4
198 deepseek/deepseek-v4-pro limited data parasail 0.00% 1
199 deepseek/deepseek-r1-turbo limited data novita 0.00% 1
200 baidu/ernie-4.5-vl-28b-a3b limited data novita 0.00% 2
201 qwen/qwen3-235b-a22b-instruct-2507 limited data novita 0.00% 1
202 meta-llama/llama-4-scout-17b-16e-instruct limited data novita 0.00% 2
203 moonshotai/kimi-k2.5 limited data parasail 0.00% 3
204 meta-llama/llama-4-maverick limited data parasail 0.00% 6
205 qwen/qwen3-next-80b-a3b-thinking limited data novita 0.00% 2
206 meta-llama/Llama-3.3-70B-Instruct limited data nebius 0.00% 5
207 inclusionai/ring-2.6-1t limited data novita 0.00% 2
208 deepseek/deepseek-v4-flash limited data novita 0.00% 2
209 deepseek/deepseek-v3-0324 limited data novita 0.00% 2
210 zai-org/glm-4.6v limited data novita 0.00% 2
211 deepseek/deepseek-v3.2-exp limited data novita 0.00% 1
212 inclusionai/ling-2.6-flash limited data novita 0.00% 1
213 qwen/qwen3-coder-next limited data novita 0.00% 1
214 deepseek/deepseek-v4-pro limited data novita 0.00% 5
215 deepseek/deepseek-r1-distill-llama-70b limited data novita 0.00% 2
216 qwen/qwen-2.5-72b-instruct limited data novita 0.00% 2
217 qwen/qwen3-235b-a22b-fp8 limited data novita 0.00% 1
218 meta-llama/llama-3.3-70b-instruct limited data novita 0.00% 1
219 xiaomimimo/mimo-v2.5-pro limited data novita 0.00% 3
220 sao10k/l3-8b-lunaris limited data novita 0.00% 1
221 deepseek/deepseek-v3.2 limited data novita 0.00% 2
222 kwaipilot/kat-coder-pro limited data novita 0.00% 4
223 qwen/qwen3-vl-30b-a3b-thinking limited data novita 0.00% 1
224 deepseek/deepseek-v3.1 limited data novita 0.00% 2
225 moonshotai/kimi-k2.5 limited data novita 0.00% 1
226 qwen/qwen3-coder-30b-a3b-instruct limited data novita 0.00% 1
227 qwen/qwen3.5-27b limited data novita 0.00% 1
228 zai-org/glm-5.1 limited data novita 0.00% 1
229 qwen/qwen3.5-122b-a10b limited data novita 0.00% 1
230 zai-org/glm-4.5v limited data novita 0.00% 2
231 deepseek/deepseek-v3-turbo limited data novita 0.00% 1
232 qwen/qwen3.6-27b limited data novita 0.00% 1
233 google/gemma-3-12b-it limited data novita 0.00% 1
234 zai-org/glm-4.7-flash limited data novita 0.00% 1
235 qwen/qwen2.5-vl-72b-instruct limited data parasail 0.00% 1
236 qwen/qwen3-vl-30b-a3b-instruct limited data novita 0.00% 1
237 qwen/qwen3-235b-a22b-thinking-2507 limited data novita 0.00% 2
238 arcee-ai/trinity-large-thinking limited data parasail 0.00% 1
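The p50 and p95 columns and the "limited data" mark can be illustrated with a short sketch. It uses the nearest-rank percentile method, which is one common definition and not necessarily the one TrustedRouter applies. The threshold of 20 samples is an assumption inferred from the rows above, where every row with fewer than 20 samples carries the mark. The function names and TTFT values are hypothetical.

```python
from typing import List, Optional

# Assumed threshold, inferred from the table rather than documented.
LIMITED_DATA_THRESHOLD = 20


def percentile_nearest_rank(values: List[float], p: int) -> Optional[float]:
    """Nearest-rank percentile for an integer p in 1..100.

    Returns None when there are no samples to summarize.
    """
    if not 0 < p <= 100:
        raise ValueError("p must be between 1 and 100")
    if not values:
        return None
    ordered = sorted(values)
    # Integer ceiling of p * n / 100, avoiding floating-point rounding.
    rank = -(-p * len(ordered) // 100)
    return ordered[rank - 1]


def is_limited_data(sample_count: int) -> bool:
    """True when a row would carry the "limited data" mark."""
    return sample_count < LIMITED_DATA_THRESHOLD


# Made-up TTFT samples in seconds, for illustration only.
ttft_samples = [0.31, 0.27, 0.45, 0.29, 1.20]
p50_ttft = percentile_nearest_rank(ttft_samples, 50)
p95_ttft = percentile_nearest_rank(ttft_samples, 95)
print(p50_ttft, p95_ttft, is_limited_data(len(ttft_samples)))
```

With only five samples the p95 is simply the slowest request, which shows why rows with few samples give unstable tail figures.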
