Managed Inference Job

How to connect to a Managed Inference endpoint

Send a chat-completion request to a Managed Inference endpoint from the CLI or an OpenAI-compatible client.

Send a chat-completion request to a Managed Inference endpoint from the CLI or any OpenAI-compatible client.

Prerequisites

You need the following before you start.

A running Managed Inference Job with a serving endpoint. See Create a Managed Inference Job.
Your Managed Inference API key, if the endpoint requires an authorization header. See Create an API key.
The CosmicAC CLI installed and configured, for the CLI method. See Install the CLI.

Steps

Get your connection details

From the endpoint, note the endpoint ID, the model, and your API key. For curl, you also need your deployment's inference URL.

Send a request

Use the CLI or any OpenAI-compatible client.

cosmicac inference chat \
  --endpoint-id <endpoint-id> \
  --model <model> \
  --api-key <api-key> \
  --message "Hello"

Omit --message for an interactive session, or add --stream for streaming output.

The endpoint returns the model's response.

Next steps

How to create an API key (CosmicAC web interface)

Previous Page

Platform management

Next Page

On this page

Prerequisites Steps Get your connection details Send a request Next steps