How to create a Managed Inference Job (CosmicAC web interface)

Create a Managed Inference Job in the CosmicAC web interface, then call your model.

Create a Managed Inference Job in the CosmicAC web interface. You set the basics, select a model, configure the endpoint and hardware, then launch the job. The Job configuration reference describes every field.

Prerequisites

You need the following before you start.

A running CosmicAC deployment. See Installation.
Access to the CosmicAC web interface.

Steps

Open the new job form

On the Jobs page, click New Job.

Enter the basics

In the Basics section, fill in the Job name, Location, and Tags.

Select the job type

In the What kind of job? section, select Managed Inference.

Select a model

In the Model to serve section, select a Model. The Serving configuration comes prefilled with the model's default parameters, which an admin sets in the model master. Adjust them as needed. The Job configuration reference describes every serving field.

To find a model, browse the Hugging Face model hub or the vLLM supported models list.

Configure the endpoint

Still in the Model to serve section, set the Endpoint name and Replicas under Endpoint.

Require an API key

Under API key required, keep Require Authorization header enabled. With it enabled, callers send an API key to reach the endpoint. See Create an API key.

Choose the hardware

In the Hardware section, select a GPU and the GPU count.

Review and create the job

In the Review & launch section, confirm the job spec is valid, then click Create job.

Open the endpoint

Wait for the job to go live, then click Open endpoint.

Call your model

Copy the endpoint URL, then send a request with your API key in the Authorization header. Use the example request shown on the endpoint as a starting point.