Fractional GPUs
In this guide, you will set up a Developer Pod SKU that shares a single GPU across developer pods by using fractional GPUs through the KAI Scheduler. Specifically, you will allocate a fraction of the GPU's available memory to each user who requests a developer pod.
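The compute profile in this guide creates the pod for you, so you do not write a manifest by hand. Purely as an illustration of what a GPU-memory request to KAI Scheduler can look like, the sketch below builds a pod manifest as a Python dictionary. It assumes the upstream KAI Scheduler conventions (a `gpu-memory` annotation in MiB, `schedulerName: kai-scheduler`, and a queue label whose key may differ between KAI Scheduler versions). The function name, pod name, queue name, and image are hypothetical, and the manifest generated by the Fractional GPU Memory Dev Pod profile may differ.

```python
import json


def fractional_gpu_pod(name, queue, gpu_memory_mib, image,
                       queue_label="kai.scheduler/queue"):
    """Build an illustrative pod manifest that requests a slice of GPU memory.

    Assumptions (not taken from this guide): KAI Scheduler reads the
    `gpu-memory` annotation (in MiB) and the queue label key given by
    `queue_label`; older releases may use a different label key.
    """
    if gpu_memory_mib <= 0:
        raise ValueError("gpu_memory_mib must be a positive number of MiB")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": name,
            "labels": {queue_label: queue},
            # Annotation values must be strings in Kubernetes.
            "annotations": {"gpu-memory": str(gpu_memory_mib)},
        },
        "spec": {
            "schedulerName": "kai-scheduler",
            "containers": [{"name": "main", "image": image}],
        },
    }


# Hypothetical values, for illustration only.
manifest = fractional_gpu_pod("dev-pod-example", "example-queue", 2000, "ubuntu:22.04")
print(json.dumps(manifest, indent=2))
```

The queue name here plays the same role as the KAI Scheduler Queue value that you supply in the profile's input settings later in this guide.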
Assumptions
This exercise assumes the following requirements are in place.
- Admin access to the Rafay Operations Console
- A managed Kubernetes cluster with a GPU, preconfigured with the following prerequisite:
  - KAI Scheduler installed and provisioned for GPU Sharing
1. Load Compute Profile
In this section, you will load the compute profile for the Fractional Developer Pod SKU.
- In the Tenant console, navigate to Help -> API Reference -> V3 APIs
- Locate ComputeProfile Apply and expand the section
- Click Try it out
- Enter the project name system-catalog
- Download Profile JSON
- Copy and paste the downloaded profile JSON into the request body
- Click Execute
You should get a 200 response code.
2. Configure Compute Profile
In this section, you will configure the compute profile with the specific input variables for your environment.
- In the Tenant console, navigate to SKU Studio -> Compute Profiles
- Select the system-catalog project
- Click Fractional GPU Memory Dev Pod
- Navigate to the "Input Settings" section of the profile
- Update the values in the following sections with the values specific to your environment
| Name | Value |
|---|---|
| Host Cluster Name | Name of the managed Kubernetes cluster in inventory |
| Hostname Suffix | Hostname suffix for web access (e.g., 'example.com'). |
| Ingress Class Name | Name of the IngressClass resource to use (e.g., 'nginx'). |
| KeyZ | The name of the KAI Scheduler Queue to be used |
| Kubeconfig | The Kubeconfig of the host cluster |
| Node Type | The node_type value set in inventory for the nodes to be used in the host cluster |
| Pod Image | The pod image to be used |
- Click Save Changes
3. End User Utilization
Finally, you will use a tenant end user account to deploy and use a developer pod with a fractional GPU.
- In the Tenant console, navigate to SKU Studio -> Compute Instances
- Select the system-catalog project
- Click New Compute Instance
- Click Select on the Fractional GPU Memory Dev Pod card
- Enter a name for the instance
- Click Save & Continue
- Modify any user parameters as needed
- Click Deploy
After 1-2 minutes, the developer pod will be deployed.
- Use the SSH information provided to log into the pod
- Run the following command to view the GPU
nvidia-smi
The command shows the details and status of the entire GPU. However, it lists only the processes running in your own pod. Users of other pods that share the GPU cannot see each other's running processes.
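If you want to capture the GPU details from a script instead of reading the table by eye, a minimal sketch follows. It relies on the `--query-gpu` and `--format=csv,noheader,nounits` options of `nvidia-smi`. The function names `parse_gpu_csv` and `query_gpus` are hypothetical helpers, and the sketch assumes Python 3 is available in the pod image and that the memory fields are reported as integers in MiB (some devices report `[N/A]`, which this sketch does not handle).

```python
import csv
import io
import subprocess

# Fields requested from nvidia-smi; memory values are reported in MiB.
QUERY_FIELDS = ["name", "memory.total", "memory.used"]


def parse_gpu_csv(text):
    """Parse `nvidia-smi --format=csv,noheader,nounits` output into dicts."""
    gpus = []
    for record in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not record:
            continue  # skip blank lines
        name, total, used = (field.strip() for field in record)
        gpus.append({
            "name": name,
            "memory_total_mib": int(total),
            "memory_used_mib": int(used),
        })
    return gpus


def query_gpus():
    """Run nvidia-smi inside the pod and return one dict per visible GPU."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader,nounits",
    ]
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return parse_gpu_csv(result.stdout)
```

Calling `query_gpus()` from a Python session inside the developer pod returns the name and memory figures that the pod reports, which you can log or compare across pods that share the same GPU.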





