OpenAI compatible API. Attested gateway. Public status.

Nebius Token Factory

Nebius Token Factory models on TrustedRouter with prices, routes, policy notes, and source links.

Verify gateway
1 URLbase_url migration
100smodels and routes
0prompt logs by default

nebius

No logs

All providers

ProviderNebius Token Factory
Models20 public models
Prepaid routes18
BYOK routes20
Zero data retentionyes
Confidential computenot claimed
Provider E2EEnot claimed
Policy noteMarked ZDR via TrustedRouter's arrangement — Nebius RETAINS inputs/outputs by default (for speculative decoding); zero retention is an opt-in control, which the deployed Nebius account has enabled. Nebius does not train on customer data.
Policy source

Measured performance

320 samples

Continuously sampled across Nebius Token Factory's routed models — p50 TTFT, throughput, and success rate. Unsupported route and probe-configuration rows are separated from provider downtime. No prompt or output content stored.

p50 TTFT
Throughput
Uptime0.00%
Modelp50 TTFTp50 TTFBThroughputUptimeConfig excludedSamples
Qwen/Qwen3-30B-A3B-Instruct-2507 0.00% 17
google/gemma-3-27b-it 0.00% 16
Qwen/Qwen3.5-397B-A17B 0.00% 17
Qwen/Qwen3-Next-80B-A3B-Thinking 0.00% 19
MiniMaxAI/MiniMax-M2.5 0.00% 18
NousResearch/Hermes-4-405B 0.00% 18
nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B 0.00% 26
nvidia/nemotron-3-super-120b-a12b 0.00% 12
Qwen/Qwen3-32B 0.00% 17
NousResearch/Hermes-4-70B 0.00% 22
deepseek-ai/DeepSeek-V4-Pro 0.00% 21
nvidia/Llama-3_1-Nemotron-Ultra-253B-v1 0.00% 16
openai/gpt-oss-120b 0.00% 20
nvidia/Nemotron-3-Nano-Omni 0.00% 23
Qwen/Qwen2.5-VL-72B-Instruct 0.00% 19
Qwen/Qwen3-235B-A22B-Instruct-2507 0.00% 13
zai-org/GLM-5.1 0.00% 10
meta-llama/Llama-3.3-70B-Instruct 0.00% 16

Full provider & model leaderboard.

Provider models

Models served by Nebius Token Factory.

Each row links to pricing, provider, benchmark, and API pages for the model.

Model Context Endpoints Prompt Completion Routes
MiniMaxAI/MiniMax-M2.5
MiniMax M2.5
204,800 2 $0.33/1M $1.32/1M prepaid BYOK
NousResearch/Hermes-4-405B
Hermes 4 405B
131,072 2 $1.1/1M $3.3/1M prepaid BYOK
NousResearch/Hermes-4-70B
Hermes 4 70B
131,072 2 $0.143/1M $0.44/1M prepaid BYOK
Qwen/Qwen2.5-VL-72B-Instruct
Qwen2.5 VL 72B Instruct
32,768 2 $0.22/1M $0.77/1M prepaid BYOK
Qwen/Qwen3-235B-A22B-Instruct-2507
Qwen3 235B A22B Instruct 2507
131,072 2 $0.22/1M $0.66/1M prepaid BYOK
Qwen/Qwen3-30B-A3B-Instruct-2507
Qwen3 30B A3B Instruct 2507
131,072 2 $0.11/1M $0.33/1M prepaid BYOK
Qwen/Qwen3-32B
Qwen3 32B
131,072 2 $0.11/1M $0.33/1M prepaid BYOK
Qwen/Qwen3-Next-80B-A3B-Thinking
Qwen3 Next 80B A3B Thinking
131,072 2 $0.165/1M $1.65/1M prepaid BYOK
Qwen/Qwen3.5-397B-A17B
Qwen3.5 397B A17B
262,144 2 $0.66/1M $3.96/1M prepaid BYOK
deepseek-ai/DeepSeek-V4-Pro
DeepSeek V4 Pro
1,048,576 2 $1.859/1M $3.718/1M prepaid BYOK
google/gemma-2-2b-it
gemma 2 2b it
8,192 1 $0.022/1M $0.066/1M BYOK
google/gemma-3-27b-it
Google: Gemma 3 27B
131,072 2 $0.1309/1M $0.22/1M prepaid BYOK
meta-llama/Llama-3.3-70B-Instruct
Llama 3.3 70B Instruct
131,072 2 $0.143/1M $0.44/1M prepaid BYOK
meta-llama/Meta-Llama-3.1-8B-Instruct
Meta Llama 3.1 8B Instruct
128,000 1 $0.022/1M $0.066/1M BYOK
nvidia/Llama-3_1-Nemotron-Ultra-253B-v1
Llama 3_1 Nemotron Ultra 253B v1
128,000 2 $0.66/1M $1.98/1M prepaid BYOK
nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B
NVIDIA Nemotron 3 Nano 30B A3B
131,072 2 $0.11/1M $0.33/1M prepaid BYOK
nvidia/Nemotron-3-Nano-Omni
Nemotron 3 Nano Omni
131,072 2 $0.165/1M $0.495/1M prepaid BYOK
nvidia/nemotron-3-super-120b-a12b
nemotron 3 super 120b a12b
131,072 2 $0.66/1M $1.98/1M prepaid BYOK
openai/gpt-oss-120b
OpenAI: gpt-oss-120b
131,072 2 $0.165/1M $0.66/1M prepaid BYOK
zai-org/GLM-5.1
GLM 5.1
204,800 2 $1.54/1M $4.84/1M prepaid BYOK

Sign in

Choose a sign in method.