Skip to content
Dashboard

Sort providers by cost, latency, or throughput on AI Gateway

Link to headingBasic usage

sort-cost
import { streamText } from 'ai';
const result = streamText({
model: 'openai/gpt-oss-120b',
prompt: 'Summarize this internal document.',
providerOptions: {
gateway: {
sort: 'cost', // Use the lowest cost provider first
},
},
});

Sort example by cost for GPT OSS 120B

Link to headingCombine with other routing controls

sort-zdr
import { streamText } from 'ai';
const result = streamText({
model: 'deepseek/deepseek-v4-pro',
prompt,
providerOptions: {
gateway: {
zeroDataRetention: true,
sort: 'ttft', // Among ZDR-compliant providers in this set, try the lowest latency first
},
},
});

Sample ZDR filtering and TTFT sorting for DeepSeek V4 Pro

Link to headingInspecting routing decisions

sample-sort-metadata
{
"gateway": {
"routing": {
"sort": {
"option": "cost",
"executionOrder": ["novita", "groq", "fireworks", "baseten", "cerebras"],
"metrics": {
"novita": 0.10,
"groq": 0.15,
"cerebras": 0.20,
"fireworks": 0.22,
"baseten": 0.25
},
"deprioritizedProviders": ["cerebras"]
}
}
}
}

Sample execution order for GPT OSS 120B

Ready to deploy?