
Fractional GPUs

In this guide, you will set up a Developer Pod SKU that shares a GPU across developer pods using fractional GPUs through the KAI Scheduler. Specifically, you will allocate a fraction of the GPU's available memory to the user requesting the developer pod.
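For orientation, the sketch below builds the kind of pod manifest that a fractional GPU memory request might translate to under KAI Scheduler. It is an illustration only: the compute profile in this guide generates the real workload. The `gpu-memory` annotation (in MiB), the `kai.scheduler/queue` label, and the `kai-scheduler` scheduler name are assumptions based on upstream KAI Scheduler examples and may differ in your installed version. The pod name, image, and queue name are hypothetical placeholders.

```python
import json


def fractional_gpu_pod(name, image, queue, gpu_memory_mib):
    """Build a pod manifest that asks for a slice of GPU memory (sketch)."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": name,
            # Assumed label key for the KAI Scheduler queue; verify for your version.
            "labels": {"kai.scheduler/queue": queue},
            # Assumed annotation: requested GPU memory in MiB, passed as a string.
            "annotations": {"gpu-memory": str(gpu_memory_mib)},
        },
        "spec": {
            # Assumed scheduler name used by a default KAI Scheduler install.
            "schedulerName": "kai-scheduler",
            "containers": [
                {"name": "dev", "image": image, "args": ["sleep", "infinity"]}
            ],
        },
    }


# Hypothetical values for illustration only.
manifest = fractional_gpu_pod("dev-pod-example", "ubuntu", "example-queue", 4096)
print(json.dumps(manifest, indent=2))
```

The manifest is emitted as JSON so it can be inspected or compared with what the profile actually deploys to the host cluster.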


Assumptions

This exercise assumes the following requirements are in place.

  • Admin access to the Rafay Operations Console
  • A managed Kubernetes cluster with a GPU, preconfigured with the following prerequisite:
      • KAI Scheduler installed and provisioned for GPU Sharing

1. Load Compute Profile

In this section, you will load the compute profile for the Fractional Developer Pod SKU.

  • In the Tenant console, navigate to Help -> API Reference -> V3 APIs
  • Locate ComputeProfile Apply and expand the section
  • Click Try it out
  • Enter the project name system-catalog
  • Download the profile JSON
  • Copy and paste the downloaded profile JSON into the request body
  • Click Execute

You should get a 200 response code.
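Before pasting the downloaded profile into the request body, it can help to confirm that the file is well-formed JSON, since a truncated download would cause the Apply call to fail. The following is a minimal sketch that assumes the profile is a single JSON object; the sample string is a hypothetical stand-in and not the real profile content.

```python
import json


def load_profile_body(text):
    """Parse profile JSON text and return it re-serialized for pasting."""
    profile = json.loads(text)  # raises json.JSONDecodeError on malformed input
    if not isinstance(profile, dict):
        raise ValueError("expected a JSON object at the top level")
    return json.dumps(profile, indent=2)


# Hypothetical stand-in for the downloaded file's contents.
sample = '{"example_key": "example_value"}'
print(load_profile_body(sample))
```

In practice you would read the downloaded file with `open(path).read()` and pass its contents to `load_profile_body`, where `path` is wherever you saved the file.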



2. Configure Compute Profile

In this section, you will configure the compute profile with the specific input variables for your environment.

  • In the Tenant console, navigate to SKU Studio -> Compute Profiles
  • Select the system-catalog project
  • Click Fractional GPU Memory Dev Pod
  • Navigate to the "Input Settings" section of the profile
  • Update the following settings with values specific to your environment:

| Name | Value |
|------|-------|
| Host Cluster Name | Name of the managed Kubernetes cluster in inventory |
| Hostname Suffix | Hostname suffix for web access (e.g., 'example.com') |
| Ingress Class Name | Name of the IngressClass resource to use (e.g., 'nginx') |
| KeyZ | The name of the KAI Scheduler Queue to be used |
| Kubeconfig | The Kubeconfig of the host cluster |
| Node Type | The node_type value set in inventory for the nodes to be used in the host cluster |
| Pod Image | The pod image to be used |

  • Click Save Changes
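A quick local check of the values you intend to enter can catch blanks and malformed hostnames before you save. The sketch below is not part of the console: it assumes you have collected the input settings into a dictionary keyed by the setting names shown above, and the DNS pattern is a simplified lowercase hostname check rather than a full validator.

```python
import re

# Setting names as they appear in the Input Settings section.
REQUIRED = [
    "Host Cluster Name", "Hostname Suffix", "Ingress Class Name",
    "KeyZ", "Kubeconfig", "Node Type", "Pod Image",
]

# Simplified check: dot-separated lowercase labels, 253 characters at most.
DNS_NAME = re.compile(
    r"^(?=.{1,253}$)[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?"
    r"(\.[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?)*$"
)


def check_settings(settings):
    """Return a list of problems found in the input settings (empty if none)."""
    problems = [
        f"missing value: {key}"
        for key in REQUIRED
        if not str(settings.get(key) or "").strip()
    ]
    suffix = str(settings.get("Hostname Suffix") or "").strip()
    if suffix and not DNS_NAME.match(suffix):
        problems.append(f"hostname suffix is not a valid DNS name: {suffix}")
    return problems


# Hypothetical values for illustration only.
example = {key: "placeholder" for key in REQUIRED}
example["Hostname Suffix"] = "example.com"
print(check_settings(example))
```

An empty list means every setting has a value and the hostname suffix passes the simplified pattern; it does not confirm that the cluster, queue, or ingress class actually exist.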



3. End User Utilization

Finally, you will use a tenant end user account to deploy and use a developer pod with a fractional GPU.

  • In the Tenant console, navigate to SKU Studio -> Compute Instances
  • Select the system-catalog project
  • Click New Compute Instance
  • Click Select on Fractional GPU Memory Dev Pod card
  • Enter a name for the instance
  • Click Save & Continue


  • Modify any user parameters as needed
  • Click Deploy


After 1-2 minutes, the developer pod will be deployed.


  • Use the SSH information provided to log into the pod
  • Run the following command to view the GPU:
nvidia-smi
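
If you want the memory figures in a form you can script against, `nvidia-smi` also supports a CSV query mode. The sketch below, run inside the pod, assumes Python is available in the pod image and that every queried field returns a number; fields reported as unavailable would need extra handling. Whether the reported total reflects the allocated fraction or the whole device depends on how GPU sharing is enforced in your cluster.

```python
import shutil
import subprocess

# Query name plus total and used memory; with "nounits" the values are in MiB.
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,memory.total,memory.used",
    "--format=csv,noheader,nounits",
]


def parse_gpu_rows(output):
    """Turn the CSV output of the query above into a list of dictionaries."""
    rows = []
    for line in output.strip().splitlines():
        # Split from the right so a comma inside the GPU name cannot break parsing.
        name, total, used = [field.strip() for field in line.rsplit(",", 2)]
        rows.append({"name": name, "total_mib": int(total), "used_mib": int(used)})
    return rows


if shutil.which("nvidia-smi"):
    result = subprocess.run(QUERY, capture_output=True, text=True)
    if result.returncode == 0:
        for row in parse_gpu_rows(result.stdout):
            print(row)
    else:
        print("nvidia-smi failed:", result.stderr.strip())
else:
    print("nvidia-smi not found on PATH")
```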