OpenAI compatible API. Attested gateway. Public status.
Nebius Token Factory
Nebius Token Factory models on TrustedRouter with prices, routes, policy notes, and source links.
1 URLbase_url migration
100smodels and routes
0prompt logs by default
nebius
No logs
| Provider | Nebius Token Factory |
|---|---|
| Models | 20 public models |
| Prepaid routes | 18 |
| BYOK routes | 20 |
| Zero data retention | yes |
| Confidential compute | not claimed |
| Provider E2EE | not claimed |
| Policy note | Marked ZDR via TrustedRouter's arrangement — Nebius RETAINS inputs/outputs by default (for speculative decoding); zero retention is an opt-in control, which the deployed Nebius account has enabled. Nebius does not train on customer data. Policy source |
Measured performance
320 samplesContinuously sampled across Nebius Token Factory's routed models — p50 TTFT, throughput, and success rate. Unsupported route and probe-configuration rows are separated from provider downtime. No prompt or output content stored.
| p50 TTFT | — |
|---|---|
| Throughput | — |
| Uptime | 0.00% |
| Model | p50 TTFT | p50 TTFB | Throughput | Uptime | Config excluded | Samples |
|---|---|---|---|---|---|---|
| Qwen/Qwen3-30B-A3B-Instruct-2507 | — | — | — | 0.00% | — | 17 |
| google/gemma-3-27b-it | — | — | — | 0.00% | — | 16 |
| Qwen/Qwen3.5-397B-A17B | — | — | — | 0.00% | — | 17 |
| Qwen/Qwen3-Next-80B-A3B-Thinking | — | — | — | 0.00% | — | 19 |
| MiniMaxAI/MiniMax-M2.5 | — | — | — | 0.00% | — | 18 |
| NousResearch/Hermes-4-405B | — | — | — | 0.00% | — | 18 |
| nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B | — | — | — | 0.00% | — | 26 |
| nvidia/nemotron-3-super-120b-a12b | — | — | — | 0.00% | — | 12 |
| Qwen/Qwen3-32B | — | — | — | 0.00% | — | 17 |
| NousResearch/Hermes-4-70B | — | — | — | 0.00% | — | 22 |
| deepseek-ai/DeepSeek-V4-Pro | — | — | — | 0.00% | — | 21 |
| nvidia/Llama-3_1-Nemotron-Ultra-253B-v1 | — | — | — | 0.00% | — | 16 |
| openai/gpt-oss-120b | — | — | — | 0.00% | — | 20 |
| nvidia/Nemotron-3-Nano-Omni | — | — | — | 0.00% | — | 23 |
| Qwen/Qwen2.5-VL-72B-Instruct | — | — | — | 0.00% | — | 19 |
| Qwen/Qwen3-235B-A22B-Instruct-2507 | — | — | — | 0.00% | — | 13 |
| zai-org/GLM-5.1 | — | — | — | 0.00% | — | 10 |
| meta-llama/Llama-3.3-70B-Instruct | — | — | — | 0.00% | — | 16 |
Provider models
Models served by Nebius Token Factory.
Each row links to pricing, provider, benchmark, and API pages for the model.
| Model | Context | Endpoints | Prompt | Completion | Routes |
|---|---|---|---|---|---|
MiniMaxAI/MiniMax-M2.5MiniMax M2.5 |
204,800 | 2 | $0.33/1M | $1.32/1M | prepaid BYOK |
NousResearch/Hermes-4-405BHermes 4 405B |
131,072 | 2 | $1.1/1M | $3.3/1M | prepaid BYOK |
NousResearch/Hermes-4-70BHermes 4 70B |
131,072 | 2 | $0.143/1M | $0.44/1M | prepaid BYOK |
Qwen/Qwen2.5-VL-72B-InstructQwen2.5 VL 72B Instruct |
32,768 | 2 | $0.22/1M | $0.77/1M | prepaid BYOK |
Qwen/Qwen3-235B-A22B-Instruct-2507Qwen3 235B A22B Instruct 2507 |
131,072 | 2 | $0.22/1M | $0.66/1M | prepaid BYOK |
Qwen/Qwen3-30B-A3B-Instruct-2507Qwen3 30B A3B Instruct 2507 |
131,072 | 2 | $0.11/1M | $0.33/1M | prepaid BYOK |
Qwen/Qwen3-32BQwen3 32B |
131,072 | 2 | $0.11/1M | $0.33/1M | prepaid BYOK |
Qwen/Qwen3-Next-80B-A3B-ThinkingQwen3 Next 80B A3B Thinking |
131,072 | 2 | $0.165/1M | $1.65/1M | prepaid BYOK |
Qwen/Qwen3.5-397B-A17BQwen3.5 397B A17B |
262,144 | 2 | $0.66/1M | $3.96/1M | prepaid BYOK |
deepseek-ai/DeepSeek-V4-ProDeepSeek V4 Pro |
1,048,576 | 2 | $1.859/1M | $3.718/1M | prepaid BYOK |
google/gemma-2-2b-itgemma 2 2b it |
8,192 | 1 | $0.022/1M | $0.066/1M | BYOK |
google/gemma-3-27b-itGoogle: Gemma 3 27B |
131,072 | 2 | $0.1309/1M | $0.22/1M | prepaid BYOK |
meta-llama/Llama-3.3-70B-InstructLlama 3.3 70B Instruct |
131,072 | 2 | $0.143/1M | $0.44/1M | prepaid BYOK |
meta-llama/Meta-Llama-3.1-8B-InstructMeta Llama 3.1 8B Instruct |
128,000 | 1 | $0.022/1M | $0.066/1M | BYOK |
nvidia/Llama-3_1-Nemotron-Ultra-253B-v1Llama 3_1 Nemotron Ultra 253B v1 |
128,000 | 2 | $0.66/1M | $1.98/1M | prepaid BYOK |
nvidia/NVIDIA-Nemotron-3-Nano-30B-A3BNVIDIA Nemotron 3 Nano 30B A3B |
131,072 | 2 | $0.11/1M | $0.33/1M | prepaid BYOK |
nvidia/Nemotron-3-Nano-OmniNemotron 3 Nano Omni |
131,072 | 2 | $0.165/1M | $0.495/1M | prepaid BYOK |
nvidia/nemotron-3-super-120b-a12bnemotron 3 super 120b a12b |
131,072 | 2 | $0.66/1M | $1.98/1M | prepaid BYOK |
openai/gpt-oss-120bOpenAI: gpt-oss-120b |
131,072 | 2 | $0.165/1M | $0.66/1M | prepaid BYOK |
zai-org/GLM-5.1GLM 5.1 |
204,800 | 2 | $1.54/1M | $4.84/1M | prepaid BYOK |