GPU Container Job

A GPU Container Job runs your workload inside a KubeVirt virtual machine instance (VMI) with a GPU and shell access. You connect to it and work on it like a remote computer, installing what you need and running your code interactively.

When to use one

A GPU Container Job fits when you want direct, interactive shell access to a GPU. Use one to train and fine-tune models, run experiments, or process data.

If you only want to call a language model over an API, a Managed Inference Job is the better fit. It serves the model for you, so you don't set up or run the environment yourself.

What you get

Shell access — you install what you need and run commands directly.
A dedicated GPU — one or more whole GPUs attached to the VMI through PCI passthrough.
A clean environment — the job starts from a base OS image you choose at creation.
VM-level isolation — each job runs in its own VMI.

How it works

A GPU Container Job moves through a simple lifecycle. When you create it from the CLI or the web UI, CosmicAC schedules it on a GPU node in your cluster and provisions a VMI with the GPU attached. Once the VMI is running, you connect and work on it. Stopping the job pauses it, so you can start it again later. Deleting it releases its resources.
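
The lifecycle above can be pictured as a small state machine. The sketch below is illustrative only: the state names (`PROVISIONING`, `RUNNING`, `STOPPED`, `DELETED`) and the `advance` helper are hypothetical and are not CosmicAC's actual status values or API. It also assumes that a stopped job returns directly to running when started again, and that a job can be deleted from any non-deleted state.

```python
from enum import Enum


class JobState(Enum):
    # Hypothetical state names; CosmicAC's real status values may differ.
    PROVISIONING = "provisioning"  # scheduled on a GPU node, VMI being created
    RUNNING = "running"            # VMI is up and you can connect
    STOPPED = "stopped"            # paused; can be started again later
    DELETED = "deleted"            # resources released; no way back


# Allowed moves, read off the lifecycle described in the text.
TRANSITIONS = {
    JobState.PROVISIONING: {JobState.RUNNING, JobState.DELETED},
    JobState.RUNNING: {JobState.STOPPED, JobState.DELETED},
    JobState.STOPPED: {JobState.RUNNING, JobState.DELETED},
    JobState.DELETED: set(),
}


def advance(current: JobState, target: JobState) -> JobState:
    """Return the target state if the move is allowed, else raise ValueError."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"cannot go from {current.value} to {target.value}")
    return target
```

In this model, stopping and deleting differ only in what can follow: `STOPPED` still has a path back to `RUNNING`, while `DELETED` has no outgoing transitions.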

For the component-level path a job takes through CosmicAC, see Architecture.

How you connect

You work with a running job from the CLI, which opens a shell on the VMI. From there, the job behaves like any remote machine: you run scripts, start processes, and inspect output directly.

For longer-lived access, you can set up SSH over Tailscale inside the VMI and reach it from your local machine.
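
Once SSH over Tailscale is working, a client-side SSH config entry saves retyping the connection details. The helper below is a minimal sketch that renders such an entry. The alias, hostname, and user shown are placeholders, not values CosmicAC assigns, and it assumes the job is reachable under a Tailscale hostname with an SSH server listening on it.

```python
def ssh_config_entry(alias: str, tailscale_host: str, user: str) -> str:
    """Render an OpenSSH client config stanza for a job reachable over Tailscale."""
    lines = [
        f"Host {alias}",
        f"    HostName {tailscale_host}",  # the job's Tailscale name or IP
        f"    User {user}",
        "    ServerAliveInterval 60",       # keep idle sessions from dropping
    ]
    return "\n".join(lines) + "\n"


# Placeholder values for illustration only.
entry = ssh_config_entry("gpu-job", "gpu-job.example-tailnet.ts.net", "ubuntu")
print(entry)
```

Appending the printed stanza to `~/.ssh/config` on your local machine would let you connect with `ssh gpu-job`.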
