Framework guides

Use ModelSpend with your AI framework.

ModelSpend exposes an OpenAI-compatible proxy endpoint. Any framework or library that supports a custom OpenAI base URL can route through ModelSpend: point the base URL at the proxy and use your ModelSpend API key in place of the provider key.
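
To make "OpenAI-compatible" concrete, here is a minimal sketch of the kind of HTTP request a framework would send once its base URL points at the proxy. It assumes the proxy follows the usual OpenAI conventions (a /chat/completions path under the base URL and a Bearer token in the Authorization header), which this page implies but does not spell out. build_chat_request is a hypothetical helper, and the request is only constructed, never sent.

```python
import json
import urllib.request

# Proxy base URL as given on this page.
BASE_URL = "https://api.modelspend.best/proxy/v1"


def build_chat_request(api_key: str, prompt: str, model: str = "auto") -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request.

    Assumption: the proxy accepts POST {base}/chat/completions with a
    Bearer token, as OpenAI-compatible endpoints conventionally do.
    """
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


req = build_chat_request("your-modelspend-api-key", "Summarise this support ticket.")
print(req.get_method(), req.full_url)
```

The framework integrations below produce an equivalent request for you; the sketch is only meant to show what changes (the host and the key) and what stays the same (the request shape).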

Vercel AI SDK

The Vercel AI SDK (ai package) supports a custom OpenAI provider via @ai-sdk/openai. Point it at the ModelSpend proxy and all generateText, streamText and generateObject calls route through ModelSpend automatically.

Install

npm install ai @ai-sdk/openai

Configure the provider

Set your environment variables, then create a custom OpenAI provider pointed at the ModelSpend proxy:

# .env
MODELSPEND_API_KEY=your-modelspend-api-key
MODELSPEND_BASE_URL=https://api.modelspend.best/proxy/v1

import { createOpenAI } from '@ai-sdk/openai';

const modelspend = createOpenAI({
  apiKey: process.env.MODELSPEND_API_KEY,
  baseURL: process.env.MODELSPEND_BASE_URL
            ?? 'https://api.modelspend.best/proxy/v1',
});

FinOps attribution headers

Pass attribution metadata via headers so the ModelSpend dashboard can group spend by customer, feature, or cost centre.

const modelspend = createOpenAI({
  apiKey: process.env.MODELSPEND_API_KEY,
  baseURL: 'https://api.modelspend.best/proxy/v1',
  headers: {
    'X-ModelSpend-Customer-Id': 'acme',
    'X-ModelSpend-Feature-Id': 'support-triage',
    'X-ModelSpend-Cost-Center': 'customer-success',
  },
});

Timeout and error handling

Use an AbortController for per-request timeouts. API failures surface as APICallError, which exposes the HTTP status code. An aborted request raises a different error type, so the example below rethrows it.

import { generateText, APICallError } from 'ai';

const controller = new AbortController();
const timer = setTimeout(() => controller.abort(), 30_000);

try {
  const { text } = await generateText({
    model: modelspend('auto'),
    prompt: 'Classify this support ticket.',
    abortSignal: controller.signal,
  });
  console.log(text);
} catch (err) {
  if (err instanceof APICallError) {
    console.error('API error', err.statusCode, err.message);
  } else {
    throw err;
  }
} finally {
  clearTimeout(timer);
}

Generate text

Pass 'auto' as the model ID to let ModelSpend choose the cheapest capable model, or pass any model ID your providers support.

import { generateText } from 'ai';

const { text } = await generateText({
  model: modelspend('auto'),
  prompt: 'Summarise this support ticket in two sentences.',
});

console.log(text);

Stream text

import { streamText } from 'ai';

const result = await streamText({
  model: modelspend('auto'),
  prompt: 'Draft a reply to this customer complaint.',
});

for await (const delta of result.textStream) {
  process.stdout.write(delta);
}

Next.js route handler

Use toDataStreamResponse() to stream directly from a Next.js route handler:

// app/api/chat/route.ts
import { streamText } from 'ai';
import { createOpenAI } from '@ai-sdk/openai';

const modelspend = createOpenAI({
  apiKey: process.env.MODELSPEND_API_KEY,
  baseURL: 'https://api.modelspend.best/proxy/v1',
});

export async function POST(req: Request) {
  const { messages } = await req.json();
  const result = await streamText({
    model: modelspend('auto'),
    messages,
  });
  return result.toDataStreamResponse();
}

API key security: Keep MODELSPEND_API_KEY in server-side code only. Never pass it to a browser bundle or expose it in a client component.

LangChain

LangChain's ChatOpenAI class accepts a custom base URL and API key, making it a one-change swap to route through ModelSpend.

TypeScript / JavaScript

Install

npm install @langchain/openai @langchain/core

Configure and invoke

import { ChatOpenAI } from '@langchain/openai';
import { HumanMessage } from '@langchain/core/messages';

const chat = new ChatOpenAI({
  openAIApiKey: process.env.MODELSPEND_API_KEY,
  modelName: 'auto',
  timeout: 30000,
  configuration: {
    baseURL: 'https://api.modelspend.best/proxy/v1',
  },
});

const response = await chat.invoke([
  new HumanMessage('Summarise this support ticket in two sentences.'),
]);
console.log(response.content);

FinOps attribution headers

Pass attribution metadata via defaultHeaders so dashboard reporting can group spend by customer, feature, or cost centre.

const chat = new ChatOpenAI({
  openAIApiKey: process.env.MODELSPEND_API_KEY,
  modelName: 'auto',
  timeout: 30000,
  configuration: {
    baseURL: 'https://api.modelspend.best/proxy/v1',
    defaultHeaders: {
      'X-ModelSpend-Customer-Id': 'acme',
      'X-ModelSpend-Feature-Id': 'support-triage',
      'X-ModelSpend-Cost-Center': 'customer-success',
    },
  },
});

Python

Install

python -m pip install langchain-openai

Configure and invoke

import os
from langchain_openai import ChatOpenAI

chat = ChatOpenAI(
    openai_api_key=os.environ["MODELSPEND_API_KEY"],
    openai_api_base="https://api.modelspend.best/proxy/v1",
    model_name="auto",
    request_timeout=30,
)

response = chat.invoke([{"role": "user", "content": "Summarise this support ticket."}])
print(response.content)

FinOps attribution headers

Pass attribution metadata via default_headers, mirroring the TypeScript example.

chat = ChatOpenAI(
    openai_api_key=os.environ["MODELSPEND_API_KEY"],
    openai_api_base="https://api.modelspend.best/proxy/v1",
    model_name="auto",
    default_headers={
        "X-ModelSpend-Customer-Id": "acme",
        "X-ModelSpend-Feature-Id": "support-triage",
        "X-ModelSpend-Cost-Center": "customer-success",
    },
)

Key safety: Never use MODELSPEND_API_KEY in browser-side LangChain code. Run LangChain on the server and keep the key in backend code, serverless functions, or jobs.

LlamaIndex

LlamaIndex supports a custom OpenAI base URL via the OpenAI LLM class. Point it at the ModelSpend proxy and all queries, chat engines, and agents route through ModelSpend automatically.

Python

Install

python -m pip install llama-index-llms-openai llama-index-core

Configure and query

import os
from llama_index.llms.openai import OpenAI
from llama_index.core import Settings

llm = OpenAI(
    api_key=os.environ["MODELSPEND_API_KEY"],
    api_base="https://api.modelspend.best/proxy/v1",
    model="auto",
    timeout=30,
)

# Use directly
response = llm.complete("Summarise this support ticket in two sentences.")
print(response.text)

# Or set as the global default for all LlamaIndex components
Settings.llm = llm

Chat engine

from llama_index.core.chat_engine import SimpleChatEngine

engine = SimpleChatEngine.from_defaults(llm=llm)
response = engine.chat("Classify this support ticket by urgency.")
print(response.response)

FinOps attribution headers

Pass attribution metadata as extra_headers inside additional_kwargs on the LLM so routing telemetry can be grouped by customer, feature, or cost centre.

llm = OpenAI(
    api_key=os.environ["MODELSPEND_API_KEY"],
    api_base="https://api.modelspend.best/proxy/v1",
    model="auto",
    timeout=30,
    additional_kwargs={
        "extra_headers": {
            "X-ModelSpend-Customer-Id": "acme",
            "X-ModelSpend-Feature-Id": "rag-pipeline",
            "X-ModelSpend-Cost-Center": "data-team",
        }
    },
)
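
In a multi-tenant pipeline the attribution values usually vary per customer or feature rather than being hard-coded. The following is a minimal sketch of building the header dictionary from optional values. attribution_headers is a hypothetical helper, the header names are the three shown above, and dropping unset values is an assumption about what you want rather than a documented ModelSpend requirement.

```python
from typing import Optional


def attribution_headers(
    customer_id: Optional[str] = None,
    feature_id: Optional[str] = None,
    cost_center: Optional[str] = None,
) -> dict[str, str]:
    """Return only the attribution headers that have a non-empty value."""
    candidates = {
        "X-ModelSpend-Customer-Id": customer_id,
        "X-ModelSpend-Feature-Id": feature_id,
        "X-ModelSpend-Cost-Center": cost_center,
    }
    # Skip None and empty strings so no blank header is sent.
    return {name: value for name, value in candidates.items() if value}


headers = attribution_headers(customer_id="acme", feature_id="rag-pipeline")
print(headers)
```

The resulting dictionary can be used as the extra_headers value in the additional_kwargs configuration above.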

TypeScript / JavaScript

Install

npm install llamaindex

Configure and query

import { OpenAI } from 'llamaindex';

const llm = new OpenAI({
  apiKey: process.env.MODELSPEND_API_KEY,
  model: 'auto',
  additionalSessionOptions: {
    baseURL: 'https://api.modelspend.best/proxy/v1',
  },
});

const response = await llm.complete({ prompt: 'Summarise this support ticket.' });
console.log(response.text);

Key safety: Keep MODELSPEND_API_KEY in server-side LlamaIndex code only. LlamaIndex RAG pipelines typically run on the backend; never expose the key to the browser or a public edge function.

Any OpenAI-compatible framework

ModelSpend works with any library that accepts a custom base URL. The two values you always need:

MODELSPEND_API_KEY

Your ModelSpend API key. Replaces the provider API key in any OpenAI-compatible client.

Proxy base URL

https://api.modelspend.best/proxy/v1 — use this wherever the framework accepts a custom OpenAI base URL.
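
A minimal sketch of reading both values once at startup and failing early if the key is missing. MODELSPEND_BASE_URL is the optional override variable from the Vercel AI SDK example; load_modelspend_config and ModelSpendConfig are hypothetical names used for illustration only.

```python
import os
from typing import Mapping, NamedTuple

# Documented proxy URL, used when no override is configured.
DEFAULT_BASE_URL = "https://api.modelspend.best/proxy/v1"


class ModelSpendConfig(NamedTuple):
    api_key: str
    base_url: str


def load_modelspend_config(env: Mapping[str, str] = os.environ) -> ModelSpendConfig:
    """Read the API key and proxy base URL, failing fast if the key is absent."""
    api_key = env.get("MODELSPEND_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("MODELSPEND_API_KEY is not set")
    base_url = env.get("MODELSPEND_BASE_URL", "").strip() or DEFAULT_BASE_URL
    # Strip any trailing slash so clients can append paths consistently.
    return ModelSpendConfig(api_key=api_key, base_url=base_url.rstrip("/"))


config = load_modelspend_config({"MODELSPEND_API_KEY": "your-modelspend-api-key"})
print(config.base_url)
```

Pass config.api_key and config.base_url to whichever client constructor your framework provides. Accepting the environment as a parameter keeps the function easy to test without touching real environment variables.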