How to connect to a Managed Inference endpoint
Send a chat-completion request to a Managed Inference endpoint from the CLI or an OpenAI-compatible client.
Send a chat-completion request to a Managed Inference endpoint from the CLI or any OpenAI-compatible client.
Prerequisites
You need the following before you start.
- A running Managed Inference Job with a serving endpoint. See Create a Managed Inference Job.
- Your Managed Inference API key, if the endpoint requires an authorization header. See Create an API key.
- The CosmicAC CLI installed and configured, for the CLI method. See Install the CLI.
Steps
Get your connection details
From the endpoint, note the endpoint ID, the model, and your API key. For curl, you also need your deployment's inference URL.
Send a request
Use the CLI or any OpenAI-compatible client.
cosmicac inference chat \
--endpoint-id <endpoint-id> \
--model <model> \
--api-key <api-key> \
--message "Hello"Omit --message for an interactive session, or add --stream for streaming output.
The endpoint returns the model's response.