How to create a Managed Inference Job (CosmicAC web interface)
Create a Managed Inference Job in the CosmicAC web interface, then call your model.
Create a Managed Inference Job in the CosmicAC web interface. You set the basics, select a model, configure the endpoint and hardware, then launch the job. The Job configuration reference describes every field.
Prerequisites
You need the following before you start.
- A running CosmicAC deployment. See Installation.
- Access to the CosmicAC web interface.
Steps
Open the new job form
On the Jobs page, click New Job.
Enter the basics
In the Basics section, fill in the Job name, Location, and Tags.
Select the job type
In the What kind of job? section, select Managed Inference.
Select a model
In the Model to serve section, select a Model. The Serving configuration comes prefilled with the model's default parameters, which an admin sets in the model master. Adjust them as needed. The Job configuration reference describes every serving field.
To find a model, browse the Hugging Face model hub or the vLLM supported models list.
Configure the endpoint
Still in the Model to serve section, set the Endpoint name and Replicas under Endpoint.
Require an API key
Under API key required, keep Require Authorization header enabled. With it enabled, callers send an API key to reach the endpoint. See Create an API key.
Choose the hardware
In the Hardware section, select a GPU and the GPU count.
Review and create the job
In the Review & launch section, confirm the job spec is valid, then click Create job.
Open the endpoint
Wait for the job to go live, then click Open endpoint.
Call your model
Copy the endpoint URL, then send a request with your API key in the Authorization header. Use the example request shown on the endpoint as a starting point.