Skip to main content
All products

Inference API

OpenAI-compatible, drop-in.

Every dedicated deployment exposes an OpenAI-compatible API. If your code already speaks to the OpenAI SDK, switching is a base URL and an API key.

One workspace URL routes any model you have deployed — switch models in the request body, no new integration.

What you get

No. 01

No rewrite required

chat.completions, streaming, and the request shape you already use. Change two lines and keep your client.

No. 02

Scoped API keys

Issue keys that authorize a single deployment or the whole workspace. Plaintext shown once; revoke instantly.

No. 03

Stable model routing

Pin a deployment-scoped URL when you want a fixed endpoint, or use the workspace URL and pick the model per request.