Routing

Route each request to the right model for the job.

ModelSpend avoids the default pattern of sending every request to the same expensive model. Before a request executes, the gateway evaluates the task, the applicable policy and the current provider state.

Routing flow

Prompt -> Complexity -> Policy -> Provider health -> Model choice -> Audit log

What routing considers

Prompt complexity

Simple classification, rewriting and extraction tasks can usually use cheaper models than deep reasoning or agentic workflows.

Policy constraints

Tenant rules may restrict providers, require stronger models, enforce budgets or reject unsuitable requests.

Provider state

Routing can account for provider availability, failures, latency and configured fallback paths.

Cost and capability

The selected model should be capable enough for the task without incurring unnecessary premium-model cost.
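The four considerations above can be combined into a single selection step. The following is a minimal sketch, not ModelSpend's actual implementation: the `Model` and `Policy` types, the `choose_model` function, the numeric tiers and the catalog entries are all hypothetical names and values chosen for illustration. It assumes prompt complexity has already been reduced to a required capability tier.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class Model:
    """Hypothetical catalog entry; names and prices are illustrative only."""
    name: str
    provider: str
    tier: int            # capability: 1 = low-cost, 2 = middle, 3 = strongest
    cost_per_1k: float   # illustrative cost per 1,000 tokens


@dataclass(frozen=True)
class Policy:
    """Hypothetical tenant policy: provider allow-list, minimum tier, cost cap."""
    allowed_providers: frozenset
    min_tier: int = 1
    max_cost_per_1k: float = float("inf")


def choose_model(
    required_tier: int,
    policy: Policy,
    healthy_providers: Set[str],
    catalog: Iterable[Model],
) -> Optional[Model]:
    """Return the cheapest model that satisfies every constraint, or None."""
    # Policy may demand a stronger model than the prompt alone would need.
    needed = max(required_tier, policy.min_tier)
    candidates = [
        m for m in catalog
        if m.provider in policy.allowed_providers   # policy constraints
        and m.provider in healthy_providers         # provider state
        and m.tier >= needed                        # capability
        and m.cost_per_1k <= policy.max_cost_per_1k # budget
    ]
    # Cheapest capable model wins; None means the request is rejected.
    return min(candidates, key=lambda m: (m.cost_per_1k, m.tier), default=None)


# Illustrative catalog (assumed values, not real models or prices).
CATALOG = [
    Model("small-a", "provider-a", 1, 0.2),
    Model("mid-a", "provider-a", 2, 1.0),
    Model("mid-b", "provider-b", 2, 1.5),
    Model("large-b", "provider-b", 3, 8.0),
]
```

In this sketch a fallback path needs no special handling: if `provider-a` is marked unhealthy, a tier-2 request moves to `mid-b` because it is the cheapest remaining candidate. A result of `None` stands in for a rejected request.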

Auto routing

Use the automatic route when you want ModelSpend to select the most appropriate provider and model for a request.

Automatic routing is best for applications with varied workloads, where some requests need deep reasoning and others are simple transformations.

Typical outcomes

Cheap exact task

Extraction, short classification or formatting work can route to a low-cost model.

Balanced task

Customer support, summarisation and moderate analysis can route to a middle tier.

Deep critical task

Complex reasoning, code review or high-risk analysis can route to a stronger model.
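The outcomes above amount to a mapping from task type to model tier. A minimal sketch of that mapping follows, assuming the task type is already known. The `Tier` enum, the `TASK_TIERS` table, the task-type keys and the `tier_for` function are hypothetical names for illustration, and the conservative default for unknown task types is an assumption rather than documented ModelSpend behaviour.

```python
from enum import Enum


class Tier(Enum):
    """Hypothetical tier labels mirroring the three typical outcomes."""
    LOW_COST = "low-cost"
    MIDDLE = "middle"
    STRONG = "strong"


# Task types taken from the examples in the text; the keys are illustrative.
TASK_TIERS = {
    "extraction": Tier.LOW_COST,
    "short_classification": Tier.LOW_COST,
    "formatting": Tier.LOW_COST,
    "customer_support": Tier.MIDDLE,
    "summarisation": Tier.MIDDLE,
    "moderate_analysis": Tier.MIDDLE,
    "complex_reasoning": Tier.STRONG,
    "code_review": Tier.STRONG,
    "high_risk_analysis": Tier.STRONG,
}


def tier_for(task_type: str) -> Tier:
    """Look up the tier for a task type.

    Assumption: unrecognised task types go to the strongest tier,
    trading cost for safety when the workload is unknown.
    """
    return TASK_TIERS.get(task_type, Tier.STRONG)
```

A static table like this only illustrates the end result. Automatic routing as described in this section also weighs policy and provider state, so the tier chosen for a real request may differ.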