Use ModelSpend with your AI framework.
ModelSpend exposes an OpenAI-compatible proxy endpoint. Any framework or library that supports a custom OpenAI base URL can route through ModelSpend with one environment variable change.
Vercel AI SDK
The Vercel AI SDK (ai package) supports a custom OpenAI provider via @ai-sdk/openai. Point it at the ModelSpend proxy and all generateText, streamText and generateObject calls route through ModelSpend automatically.
Install
npm install ai @ai-sdk/openai
Configure the provider
Set your environment variables, then create a custom OpenAI provider pointed at the ModelSpend proxy:
MODELSPEND_API_KEY=your-modelspend-api-key
MODELSPEND_BASE_URL=https://api.modelspend.best/proxy/v1
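import { createOpenAI } from '@ai-sdk/openai';

// Custom OpenAI-compatible provider that sends every request through the ModelSpend proxy.
const modelspend = createOpenAI({
  apiKey: process.env.MODELSPEND_API_KEY,
  // Fall back to the public proxy URL when MODELSPEND_BASE_URL is unset.
  baseURL: process.env.MODELSPEND_BASE_URL ?? 'https://api.modelspend.best/proxy/v1',
});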
FinOps attribution headers
Pass attribution metadata via headers so the ModelSpend dashboard can group spend by customer, feature, or cost centre.
const modelspend = createOpenAI({
  apiKey: process.env.MODELSPEND_API_KEY,
  baseURL: 'https://api.modelspend.best/proxy/v1',
  // Sent with every request made through this provider.
  headers: {
    'X-ModelSpend-Customer-Id': 'acme',
    'X-ModelSpend-Feature-Id': 'support-triage',
    'X-ModelSpend-Cost-Center': 'customer-success',
  },
});
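When attribution values vary per tenant or feature, it can help to build the header map in one place. The sketch below is an illustration only: the Attribution type and the attributionHeaders function are hypothetical names, not part of ModelSpend or the AI SDK. Only the three header names come from the text above.

```typescript
// Hypothetical helper: the names and shape are illustrative, not a ModelSpend API.
interface Attribution {
  customerId?: string;
  featureId?: string;
  costCenter?: string;
}

// Map each attribution field to the header name used in this guide,
// leaving out any field that is missing or blank.
function attributionHeaders(a: Attribution): Record<string, string> {
  const pairs: Array<[string, string | undefined]> = [
    ['X-ModelSpend-Customer-Id', a.customerId],
    ['X-ModelSpend-Feature-Id', a.featureId],
    ['X-ModelSpend-Cost-Center', a.costCenter],
  ];
  const result: Record<string, string> = {};
  for (const [name, value] of pairs) {
    if (value !== undefined && value.trim() !== '') {
      result[name] = value;
    }
  }
  return result;
}

// Example: only the fields that are set end up in the header map.
const exampleHeaders = attributionHeaders({
  customerId: 'acme',
  featureId: 'support-triage',
});
console.log(exampleHeaders);
```

The returned object can be passed as the headers option of createOpenAI, so a request with no cost centre simply omits that header instead of sending an empty value.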
Timeout and error handling
Use an AbortController for per-request timeouts. Errors thrown by the Vercel AI SDK are standard Error instances, so an ordinary try/catch works. Failed HTTP calls surface as APICallError, which carries the response status code.
import { generateText, APICallError } from 'ai';

// Abort the request if it has not completed within 30 seconds.
const controller = new AbortController();
const timer = setTimeout(() => controller.abort(), 30_000);

try {
  const { text } = await generateText({
    model: modelspend('auto'),
    prompt: 'Classify this support ticket.',
    abortSignal: controller.signal,
  });
  console.log(text);
} catch (err) {
  if (err instanceof APICallError) {
    console.error('API error', err.statusCode, err.message);
  } else {
    // Anything else, including the abort raised by the timer, is rethrown.
    throw err;
  }
} finally {
  clearTimeout(timer);
}
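The catch block above rethrows anything that is not an APICallError, which includes the error raised when the timer aborts the request. A caller may want to tell these cases apart. The following sketch classifies a caught value using duck typing so that it stays self-contained. The classifyError function and its result labels are hypothetical. It assumes that an aborted request surfaces as an error named 'AbortError' and that API failures carry a numeric statusCode, as in the example above. Treating 429 and 5xx responses as retryable is a common HTTP convention, not documented ModelSpend behaviour.

```typescript
// Hypothetical classification labels for a caught error.
type ErrorKind = 'timeout' | 'retryable' | 'fatal' | 'unknown';

function classifyError(err: unknown): ErrorKind {
  if (!(err instanceof Error)) return 'unknown';

  // Assumption: an aborted request rejects with an error named 'AbortError'.
  if (err.name === 'AbortError') return 'timeout';

  // Assumption: API failures expose a numeric statusCode property.
  const status = (err as Error & { statusCode?: unknown }).statusCode;
  if (typeof status === 'number') {
    return status === 429 || status >= 500 ? 'retryable' : 'fatal';
  }
  return 'unknown';
}

// Example with hand-built errors standing in for real failures.
const abortLike = new Error('request aborted');
abortLike.name = 'AbortError';
console.log(classifyError(abortLike)); // timeout

const rateLimited = Object.assign(new Error('rate limited'), { statusCode: 429 });
console.log(classifyError(rateLimited)); // retryable
```

Inside the catch block, the result could decide whether to retry, report a timeout to the user, or rethrow.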
Generate text
Pass 'auto' as the model ID to let ModelSpend choose the cheapest capable model, or pass any model ID your providers support.
import { generateText } from 'ai';

const { text } = await generateText({
  model: modelspend('auto'),
  prompt: 'Summarise this support ticket in two sentences.',
});
console.log(text);
Stream text
import { streamText } from 'ai';

const result = await streamText({
  model: modelspend('auto'),
  prompt: 'Draft a reply to this customer complaint.',
});

// Write each text delta to stdout as it arrives.
for await (const delta of result.textStream) {
  process.stdout.write(delta);
}
Next.js route handler
Use toDataStreamResponse() to stream directly from a Next.js API route:
import { streamText } from 'ai';
import { createOpenAI } from '@ai-sdk/openai';

const modelspend = createOpenAI({
  apiKey: process.env.MODELSPEND_API_KEY,
  baseURL: 'https://api.modelspend.best/proxy/v1',
});

export async function POST(req: Request) {
  const { messages } = await req.json();
  const result = await streamText({
    model: modelspend('auto'),
    messages,
  });
  return result.toDataStreamResponse();
}
Keep MODELSPEND_API_KEY in server-side code only. Never pass it to a browser bundle or expose it in a client component.
LangChain
LangChain's ChatOpenAI class accepts a custom base URL and API key, making it a one-change swap to route through ModelSpend.
TypeScript / JavaScript
Install
npm install @langchain/openai @langchain/core
Configure and invoke
import { ChatOpenAI } from '@langchain/openai';
import { HumanMessage } from '@langchain/core/messages';

const chat = new ChatOpenAI({
  openAIApiKey: process.env.MODELSPEND_API_KEY,
  modelName: 'auto',
  timeout: 30000, // milliseconds
  configuration: {
    baseURL: 'https://api.modelspend.best/proxy/v1',
  },
});

const response = await chat.invoke([
  new HumanMessage('Summarise this support ticket in two sentences.'),
]);
console.log(response.content);
FinOps attribution headers
Pass attribution metadata via defaultHeaders so dashboard reporting can group spend by customer, feature, or cost centre.
const chat = new ChatOpenAI({
  openAIApiKey: process.env.MODELSPEND_API_KEY,
  modelName: 'auto',
  timeout: 30000,
  configuration: {
    baseURL: 'https://api.modelspend.best/proxy/v1',
    defaultHeaders: {
      'X-ModelSpend-Customer-Id': 'acme',
      'X-ModelSpend-Feature-Id': 'support-triage',
      'X-ModelSpend-Cost-Center': 'customer-success',
    },
  },
});
Python
Install
pip install langchain-openai
Configure and invoke
import os

from langchain_openai import ChatOpenAI

chat = ChatOpenAI(
    openai_api_key=os.environ["MODELSPEND_API_KEY"],
    openai_api_base="https://api.modelspend.best/proxy/v1",
    model_name="auto",
    request_timeout=30,  # seconds
)

response = chat.invoke([{"role": "user", "content": "Summarise this support ticket."}])
print(response.content)
FinOps attribution headers
chat = ChatOpenAI(
    openai_api_key=os.environ["MODELSPEND_API_KEY"],
    openai_api_base="https://api.modelspend.best/proxy/v1",
    model_name="auto",
    default_headers={
        "X-ModelSpend-Customer-Id": "acme",
        "X-ModelSpend-Feature-Id": "support-triage",
        "X-ModelSpend-Cost-Center": "customer-success",
    },
)
Do not use MODELSPEND_API_KEY in browser-side LangChain code. LangChain is a server-side library; keep it in backend code, serverless functions, or jobs.
LlamaIndex
LlamaIndex supports a custom OpenAI base URL via the OpenAI LLM class. Point it at the ModelSpend proxy and all queries, chat engines, and agents route through ModelSpend automatically.
Python
Install
pip install llama-index-core llama-index-llms-openai
Configure and query
import os

from llama_index.llms.openai import OpenAI
from llama_index.core import Settings

llm = OpenAI(
    api_key=os.environ["MODELSPEND_API_KEY"],
    api_base="https://api.modelspend.best/proxy/v1",
    model="auto",
    timeout=30,
)

# Use directly
response = llm.complete("Summarise this support ticket in two sentences.")
print(response.text)

# Or set as the global default for all LlamaIndex components
Settings.llm = llm
Chat engine
from llama_index.core.chat_engine import SimpleChatEngine

engine = SimpleChatEngine.from_defaults(llm=llm)
response = engine.chat("Classify this support ticket by urgency.")
print(response.response)
FinOps attribution headers
Pass attribution metadata via additional_kwargs on the LLM so routing telemetry can be grouped by customer, feature, or cost centre.
llm = OpenAI(
    api_key=os.environ["MODELSPEND_API_KEY"],
    api_base="https://api.modelspend.best/proxy/v1",
    model="auto",
    timeout=30,
    additional_kwargs={
        "extra_headers": {
            "X-ModelSpend-Customer-Id": "acme",
            "X-ModelSpend-Feature-Id": "rag-pipeline",
            "X-ModelSpend-Cost-Center": "data-team",
        }
    },
)
TypeScript / JavaScript
Install
Configure and query
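// Assumption: the OpenAI class is imported from the llamaindex package;
// newer releases may ship it in @llamaindex/openai instead.
import { OpenAI } from 'llamaindex';

const llm = new OpenAI({
  apiKey: process.env.MODELSPEND_API_KEY,
  model: 'auto',
  additionalSessionOptions: {
    baseURL: 'https://api.modelspend.best/proxy/v1',
  },
});

const response = await llm.complete({ prompt: 'Summarise this support ticket.' });
console.log(response.text);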
Keep MODELSPEND_API_KEY in server-side LlamaIndex code only. LlamaIndex RAG pipelines typically run on the backend; never expose the key to the browser or a public edge function.
Any OpenAI-compatible framework
ModelSpend works with any library that accepts a custom base URL. The two environment variables you always need:
MODELSPEND_API_KEY: Your ModelSpend API key. Replaces the provider API key in any OpenAI-compatible client.
MODELSPEND_BASE_URL: https://api.modelspend.best/proxy/v1. Use this wherever the framework accepts a custom OpenAI base URL.
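For clients not covered above, the same two values can be resolved once and handed to whatever constructor the library provides. This is a minimal sketch, assuming the variable names used earlier in this guide. The resolveModelSpendConfig helper and the ModelSpendConfig type are hypothetical, and the environment is passed in as a plain object so the function does not depend on a particular runtime.

```typescript
// Hypothetical config shape; rename the fields to match the client being configured.
interface ModelSpendConfig {
  apiKey: string;
  baseURL: string;
}

const DEFAULT_BASE_URL = 'https://api.modelspend.best/proxy/v1';

// Resolve the API key and base URL from an environment-like object.
// Throws when the key is missing so misconfiguration fails at startup.
function resolveModelSpendConfig(
  env: Record<string, string | undefined>,
): ModelSpendConfig {
  const apiKey = env['MODELSPEND_API_KEY'];
  if (!apiKey) {
    throw new Error('MODELSPEND_API_KEY is not set');
  }
  // Use the default proxy URL when no override is given, and drop trailing slashes.
  const baseURL = (env['MODELSPEND_BASE_URL'] || DEFAULT_BASE_URL).replace(/\/+$/, '');
  return { apiKey, baseURL };
}

// Example with an inline object; in Node.js, process.env could be passed instead.
const exampleConfig = resolveModelSpendConfig({
  MODELSPEND_API_KEY: 'your-modelspend-api-key',
});
console.log(exampleConfig.baseURL);
```

Taking the environment as a parameter keeps the key lookup in server-side code that the caller controls.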