Framework guides

Use ModelSpend with your AI framework.

ModelSpend exposes an OpenAI-compatible proxy endpoint. Any framework or library that supports a custom OpenAI base URL can route through ModelSpend by changing two settings: the base URL and the API key.

Vercel AI SDK

The Vercel AI SDK (ai package) supports a custom OpenAI provider via @ai-sdk/openai. Point it at the ModelSpend proxy and all generateText, streamText and generateObject calls route through ModelSpend automatically.

Install

npm install ai @ai-sdk/openai

Configure the provider

Set your environment variables, then create a custom OpenAI provider pointed at the ModelSpend proxy:

# .env
MODELSPEND_API_KEY=your-modelspend-api-key
MODELSPEND_BASE_URL=https://api.modelspend.best/proxy/v1

import { createOpenAI } from '@ai-sdk/openai';

const modelspend = createOpenAI({
  apiKey: process.env.MODELSPEND_API_KEY,
  baseURL: process.env.MODELSPEND_BASE_URL ?? 'https://api.modelspend.best/proxy/v1',
});

FinOps attribution headers

Pass attribution metadata via headers so the ModelSpend dashboard can group spend by customer, feature, or cost centre.

const modelspend = createOpenAI({
  apiKey: process.env.MODELSPEND_API_KEY,
  baseURL: 'https://api.modelspend.best/proxy/v1',
  headers: {
    'X-ModelSpend-Customer-Id': 'acme',
    'X-ModelSpend-Feature-Id': 'support-triage',
    'X-ModelSpend-Cost-Center': 'customer-success',
  },
});
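
When the customer or feature varies per request, the header object can be built by a small helper rather than hard-coded. The following is a minimal sketch: Attribution and attributionHeaders are hypothetical names, not part of ModelSpend or the AI SDK, and only the three header names come from this guide.

```typescript
// Hypothetical helper: the type and function names are illustrative only.
interface Attribution {
  customerId?: string;
  featureId?: string;
  costCenter?: string;
}

// Map attribution fields to the header names used in this guide,
// skipping any field that is missing or blank.
function attributionHeaders(a: Attribution): Record<string, string> {
  const pairs: Array<[string, string | undefined]> = [
    ['X-ModelSpend-Customer-Id', a.customerId],
    ['X-ModelSpend-Feature-Id', a.featureId],
    ['X-ModelSpend-Cost-Center', a.costCenter],
  ];
  const headers: Record<string, string> = {};
  for (const [name, value] of pairs) {
    const trimmed = value?.trim();
    if (trimmed) headers[name] = trimmed;
  }
  return headers;
}

const exampleHeaders = attributionHeaders({
  customerId: 'acme',
  featureId: 'support-triage',
});
console.log(exampleHeaders);
```

The returned object can be passed as the headers option of createOpenAI. The AI SDK's generateText and streamText also accept a per-call headers option, which may suit cases where attribution changes from one request to the next.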

Timeout and error handling

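Use an AbortController for per-request timeouts. API failures are thrown as APICallError, a standard Error subclass that carries the HTTP status in statusCode. When the timer aborts the request, the call rejects with an abort error, which the example below rethrows.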

import { generateText, APICallError } from 'ai';

const controller = new AbortController();
const timer = setTimeout(() => controller.abort(), 30_000);

try {
  const { text } = await generateText({
    model: modelspend('auto'),
    prompt: 'Classify this support ticket.',
    abortSignal: controller.signal,
  });
  console.log(text);
} catch (err) {
  if (err instanceof APICallError) {
    console.error('API error', err.statusCode, err.message);
  } else {
    throw err;
  }
} finally {
  clearTimeout(timer);
}
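
To decide what to do with a caught error (retry, report, or surface it to the user), it can help to sort failures into a few buckets. The sketch below is one possible approach, not an SDK feature: classifyFailure and the FailureKind categories are hypothetical. It inspects only the error's name and a numeric statusCode property, so it has no dependency on the ai package.

```typescript
// Hypothetical classifier; the category names are illustrative only.
type FailureKind =
  | 'timeout'
  | 'rate-limited'
  | 'client-error'
  | 'server-error'
  | 'unknown';

function classifyFailure(err: unknown): FailureKind {
  if (typeof err !== 'object' || err === null) return 'unknown';
  const e = err as { name?: unknown; statusCode?: unknown };

  // controller.abort() rejects with an error named 'AbortError';
  // AbortSignal.timeout() uses 'TimeoutError' instead.
  if (e.name === 'AbortError' || e.name === 'TimeoutError') return 'timeout';

  // API errors expose the HTTP status as statusCode.
  if (typeof e.statusCode === 'number') {
    if (e.statusCode === 429) return 'rate-limited';
    if (e.statusCode >= 500) return 'server-error';
    if (e.statusCode >= 400) return 'client-error';
  }
  return 'unknown';
}

const aborted = Object.assign(new Error('aborted'), { name: 'AbortError' });
console.log(classifyFailure(aborted)); // timeout
console.log(classifyFailure({ statusCode: 429 })); // rate-limited
```

Calling classifyFailure(err) inside the catch block keeps the branching in one place. Which categories are worth retrying is a policy choice this sketch leaves open.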

Generate text

Pass 'auto' as the model ID, as in modelspend('auto'), to let ModelSpend choose the cheapest capable model, or pass any model ID your providers support.

import { generateText } from 'ai';

const { text } = await generateText({
  model: modelspend('auto'),
  prompt: 'Summarise this support ticket in two sentences.',
});

console.log(text);

Stream text

import { streamText } from 'ai';

const result = await streamText({
  model: modelspend('auto'),
  prompt: 'Draft a reply to this customer complaint.',
});

for await (const delta of result.textStream) {
  process.stdout.write(delta);
}

Next.js route handler

Use toDataStreamResponse() to stream directly from a Next.js App Router route handler:

// app/api/chat/route.ts
import { streamText } from 'ai';
import { createOpenAI } from '@ai-sdk/openai';

const modelspend = createOpenAI({
  apiKey: process.env.MODELSPEND_API_KEY,
  baseURL: 'https://api.modelspend.best/proxy/v1',
});

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = await streamText({
    model: modelspend('auto'),
    messages,
  });

  return result.toDataStreamResponse();
}

API key security — Keep MODELSPEND_API_KEY in server-side code only. Never pass it to a browser bundle or expose it in a client component.

LangChain

LangChain's ChatOpenAI class accepts a custom base URL and API key, making it a one-change swap to route through ModelSpend.

TypeScript / JavaScript

Install

npm install @langchain/openai @langchain/core

Configure and invoke

import { ChatOpenAI } from '@langchain/openai';
import { HumanMessage } from '@langchain/core/messages';

const chat = new ChatOpenAI({
  openAIApiKey: process.env.MODELSPEND_API_KEY,
  modelName: 'auto',
  timeout: 30000,
  configuration: {
    baseURL: 'https://api.modelspend.best/proxy/v1',
  },
});

const response = await chat.invoke([
  new HumanMessage('Summarise this support ticket in two sentences.'),
]);

console.log(response.content);

FinOps attribution headers

Pass attribution metadata via defaultHeaders so dashboard reporting can group spend by customer, feature, or cost centre.

const chat = new ChatOpenAI({
  openAIApiKey: process.env.MODELSPEND_API_KEY,
  modelName: 'auto',
  timeout: 30000,
  configuration: {
    baseURL: 'https://api.modelspend.best/proxy/v1',
    defaultHeaders: {
      'X-ModelSpend-Customer-Id': 'acme',
      'X-ModelSpend-Feature-Id': 'support-triage',
      'X-ModelSpend-Cost-Center': 'customer-success',
    },
  },
});

Python

Install

python -m pip install langchain-openai

Configure and invoke

import os
from langchain_openai import ChatOpenAI

chat = ChatOpenAI(
    openai_api_key=os.environ["MODELSPEND_API_KEY"],
    openai_api_base="https://api.modelspend.best/proxy/v1",
    model_name="auto",
    request_timeout=30,
)

response = chat.invoke([{"role": "user", "content": "Summarise this support ticket."}])
print(response.content)

FinOps attribution headers

chat = ChatOpenAI(
    openai_api_key=os.environ["MODELSPEND_API_KEY"],
    openai_api_base="https://api.modelspend.best/proxy/v1",
    model_name="auto",
    default_headers={
        "X-ModelSpend-Customer-Id": "acme",
        "X-ModelSpend-Feature-Id": "support-triage",
        "X-ModelSpend-Cost-Center": "customer-success",
    },
)

Key safety — Never set MODELSPEND_API_KEY in browser-side LangChain usage. LangChain is a server-side library; keep it in backend code, serverless functions, or jobs.

LlamaIndex

LlamaIndex supports a custom OpenAI base URL via the OpenAI LLM class. Point it at the ModelSpend proxy and all queries, chat engines, and agents route through ModelSpend automatically.

Python

Install

python -m pip install llama-index-llms-openai llama-index-core

Configure and query

import os
from llama_index.llms.openai import OpenAI
from llama_index.core import Settings

llm = OpenAI(
    api_key=os.environ["MODELSPEND_API_KEY"],
    api_base="https://api.modelspend.best/proxy/v1",
    model="auto",
    timeout=30,
)

# Use directly
response = llm.complete("Summarise this support ticket in two sentences.")
print(response.text)

# Or set as the global default for all LlamaIndex components
Settings.llm = llm

Chat engine

from llama_index.core.chat_engine import SimpleChatEngine

engine = SimpleChatEngine.from_defaults(llm=llm)

response = engine.chat("Classify this support ticket by urgency.")
print(response.response)

FinOps attribution headers

Pass attribution metadata via additional_kwargs on the LLM so routing telemetry can be grouped by customer, feature, or cost centre.

llm = OpenAI(
    api_key=os.environ["MODELSPEND_API_KEY"],
    api_base="https://api.modelspend.best/proxy/v1",
    model="auto",
    timeout=30,
    additional_kwargs={
        "extra_headers": {
            "X-ModelSpend-Customer-Id": "acme",
            "X-ModelSpend-Feature-Id": "rag-pipeline",
            "X-ModelSpend-Cost-Center": "data-team",
        }
    },
)

TypeScript / JavaScript

Install

npm install llamaindex

Configure and query

import { OpenAI } from 'llamaindex';

const llm = new OpenAI({
  apiKey: process.env.MODELSPEND_API_KEY,
  model: 'auto',
  additionalSessionOptions: {
    baseURL: 'https://api.modelspend.best/proxy/v1',
  },
});

const response = await llm.complete({ prompt: 'Summarise this support ticket.' });

console.log(response.text);

Key safety — Keep MODELSPEND_API_KEY in server-side LlamaIndex code only. LlamaIndex RAG pipelines typically run on the backend; never expose the key to the browser or a public edge function.

Any OpenAI-compatible framework

ModelSpend works with any library that accepts a custom base URL. The two settings you always need:

MODELSPEND_API_KEY

Your ModelSpend API key. Replaces the provider API key in any OpenAI-compatible client.

Proxy base URL

https://api.modelspend.best/proxy/v1 — use this wherever the framework accepts a custom OpenAI base URL.
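
For frameworks not covered above, one way to keep both settings in a single place is to resolve them once at startup and pass the result to whichever client you use. This is a minimal sketch under assumptions: resolveProxyConfig and ProxyConfig are hypothetical names, and the fallback to the default URL mirrors the MODELSPEND_BASE_URL pattern from the Vercel AI SDK section rather than any ModelSpend requirement.

```typescript
// Default base URL taken from this guide; the names below are hypothetical.
const DEFAULT_BASE_URL = 'https://api.modelspend.best/proxy/v1';

interface ProxyConfig {
  apiKey: string;
  baseURL: string;
}

// Read the two settings from an env-like object and fail fast if the key is missing.
function resolveProxyConfig(
  env: Record<string, string | undefined>,
): ProxyConfig {
  const apiKey = env['MODELSPEND_API_KEY']?.trim();
  if (!apiKey) {
    throw new Error('MODELSPEND_API_KEY is not set');
  }
  // Fall back to the default URL, and strip trailing slashes so clients
  // that append a path do not produce a double slash.
  const rawBaseURL = env['MODELSPEND_BASE_URL']?.trim() || DEFAULT_BASE_URL;
  const baseURL = rawBaseURL.replace(/\/+$/, '');
  return { apiKey, baseURL };
}

const exampleConfig = resolveProxyConfig({
  MODELSPEND_API_KEY: 'your-modelspend-api-key',
});
console.log(exampleConfig.baseURL);
```

In a Node.js server you would typically call resolveProxyConfig(process.env). Trimming the values guards against stray whitespace in a .env file.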