Compute Instance
We are now ready to launch a compute instance with GPUs in a workspace.
- In the Compute Instances menu, click on New Compute Instance.
- You will be presented with the "small" SKU that was published by the PaaS Project Admin in the prior step.
Now, click on Select. You will then be required to enter the following details:
- Unique name for the instance
- Description (optional)
- Select the workspace where you would like to deploy this instance (e.g. qa)
In the example below, we have selected the "small" SKU.
The deployment of a compute instance can take ~5 minutes as it creates the required resources. Once it is deployed successfully, it will look like the following.
Note
Unless the compute profile backing the compute instance is configured as a dedicated resource, all compute instances are isolated, virtual clusters that are deployed onto a shared, multi-tenant host Kubernetes cluster.
Access Compute Instance
Users can remotely access compute instances using Rafay's Zero Trust Kubectl. Compute instances are "virtual Kubernetes clusters", and users can access them remotely in two ways:
- Via an "integrated browser shell" in the Rafay Platform, OR
- By downloading the zero trust kubeconfig file and using it with the Kubectl CLI utility.
In the steps below, we will try out both scenarios.
Integrated Web-based Shell
Click on the operational compute instance you want to access.
Now, click on the "Kubectl" button.
This will open a shell inside the user's web browser where the user can perform Kubectl operations.
Download & Use Kubeconfig
Users can also download the zero trust kubeconfig file and use it with the Kubectl CLI on their laptops.
Click on the download kubeconfig icon and save it on your laptop.
Configure your Kubectl CLI with the downloaded Kubeconfig file using the command shown below.
export KUBECONFIG=<name of downloaded file>
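For example, on macOS or Linux the command might look like the sketch below. The filename is illustrative; use the actual name of the file you downloaded.

```shell
# Illustrative filename: replace with the actual downloaded kubeconfig.
export KUBECONFIG="$HOME/Downloads/compute-instance-kubeconfig.yaml"

# Confirm which kubeconfig the CLI will use for subsequent commands.
echo "Using kubeconfig: $KUBECONFIG"
```

The variable applies only to the current terminal session; open a new terminal and you will need to export it again.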
Now, let us test connectivity to the remote cluster using our Kubectl CLI. In the example below, we are getting the list of namespaces on the remote cluster. We will use this approach in the next step to deploy a test application to our remote compute instance.
kubectl get ns
NAME              STATUS   AGE
default           Active   19h
kube-node-lease   Active   19h
kube-public       Active   19h
kube-system       Active   19h
rafay-system      Active   19h
Deploy Test App
We will now deploy a simple application configured to request the use of "1 GPU" in our compute instance. There are two ways users can deploy to the remote compute instance.
Kubectl Web Shell
We will deploy a simple Kubernetes application to our compute instance using the zero trust kubectl web shell integrated into the Rafay Platform. This approach is well suited for users who cannot download and install the Kubectl CLI utility on their laptops.
- Click on your compute instance in the Developer Hub
- Now, click on the Kubectl button.
This will open a kubectl web shell connected to the compute instance (virtual cluster) running on the remote host cluster.
Determine GPU Type
The GPUs in the shared infrastructure powering your compute instance may be configured with access to "Full GPUs" or "Time Sliced GPUs". The YAML spec for requesting GPUs differs between the two scenarios.
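As a point of reference, a full-GPU request is typically expressed through the `nvidia.com/gpu` extended resource in the container's resource limits, as in the minimal sketch below. This is an illustrative pod spec, not the repository's manifest; the full-gpu.yaml and timesliced.yaml files used later in this guide are the authoritative specs, and the exact resource name and count for time-sliced GPUs depend on how the cluster's NVIDIA device plugin was configured.

```yaml
# Minimal illustrative sketch of a full-GPU request (not the repo's manifest).
apiVersion: v1
kind: Pod
metadata:
  name: gpu-test            # hypothetical name
spec:
  containers:
  - name: app
    image: nginx            # any image; real GPU workloads use CUDA-based images
    resources:
      limits:
        nvidia.com/gpu: 1   # ask the scheduler for one full GPU
```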
Now, let's check what kind of GPUs are attached to our node:
kubectl describe node name-of-your-node
Info
The name of your node will be auto-populated in the web shell.
Look for the section called "Allocated resources".
Name:               mohan-lab-0
.......
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests      Limits
  --------           --------      ------
  cpu                920m (5%)     3600m (22%)
  memory             1504Mi (11%)  3530Mi (26%)
  ephemeral-storage  0 (0%)        0 (0%)
  hugepages-1Gi      0 (0%)        0 (0%)
  hugepages-2Mi      0 (0%)        0 (0%)
  nvidia.com/gpu     1             1
Events:              <none>
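If you prefer a quick filter to reading the whole node description, you can grep for the GPU row. The snippet below runs that filter over a saved sample of the output; in the web shell you would pipe the live command (e.g. `kubectl describe node <your-node> | grep nvidia.com/gpu`) instead.

```shell
# Save a sample of the "Allocated resources" section to a file. In practice,
# pipe the live command:  kubectl describe node <your-node> | grep nvidia.com/gpu
cat > /tmp/node-describe.txt <<'EOF'
Allocated resources:
  Resource           Requests      Limits
  cpu                920m (5%)     3600m (22%)
  nvidia.com/gpu     1             1
EOF

# Filter the output down to the GPU row.
grep 'nvidia.com/gpu' /tmp/node-describe.txt
```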
Deploy using Kubectl Browser Shell
Scenario 1: Full GPU
Copy/paste the command below.
kubectl apply -f https://raw.githubusercontent.com/RafaySystems/getstarted/refs/heads/master/gpupaas/full-gpu.yaml
Scenario 2: Time Sliced GPU
Copy/paste the command below.
kubectl apply -f https://raw.githubusercontent.com/RafaySystems/getstarted/refs/heads/master/gpupaas/timesliced.yaml
For either scenario, you should see output like the following. Your pod should be operational in a few seconds.
deployment.apps/nginx-gpu-deployment configured
Check Status
You can check the status of the application by typing the following command.
kubectl get po
You should see something like the example below.
NAME                                    READY   STATUS    RESTARTS   AGE
nginx-gpu-deployment-669c8f7d9f-2skhm   1/1     Running   0          3m58s
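Optionally, you can verify that the pod actually sees the GPU by running `nvidia-smi` inside it. This is a best-effort sketch: the deployment name comes from the sample manifests, and whether `nvidia-smi` is available inside the container depends on the image and the NVIDIA runtime configuration, which is why the command falls back to a message when it is not.

```shell
# Best-effort GPU check inside the pod (deployment name from the sample
# manifests; nvidia-smi availability depends on the image and runtime).
kubectl exec deploy/nginx-gpu-deployment -- nvidia-smi \
  || echo "nvidia-smi not available in this container"
```

If the NVIDIA runtime injected the tools, you should see the familiar `nvidia-smi` table listing one GPU.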
Deploy using Kubectl CLI
- Copy the relevant YAML into a file
- Using the downloaded kubeconfig file from the prior step, run the following command in the CLI
Deploy Test App
kubectl apply -f filename.yaml
In a few seconds, you should see something like the following:
deployment.apps/nginx-gpu-deployment created
Check Status
You can check the status of the pod as shown in the example below. Our application was able to get access to 1 NVIDIA GPU.
kubectl get po
NAME                                    READY   STATUS    RESTARTS   AGE
nginx-gpu-deployment-669c8f7d9f-wsf55   1/1     Running   0          44s
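When you are done testing, you can remove the deployment. A sketch, assuming KUBECONFIG still points at your compute instance and `filename.yaml` is the manifest you saved earlier; `--ignore-not-found` makes the command safe to re-run if the resources are already gone.

```shell
# Remove the test deployment; --ignore-not-found avoids an error if the
# resources were already deleted, and || true keeps the script moving on.
kubectl delete -f filename.yaml --ignore-not-found || true
```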