Compute Instance
We are now ready to launch a compute instance with GPUs in a workspace.
- In the Compute Instances menu, click on New Compute Instance.
- You will be presented with the "small" SKU that was published by the PaaS Project Admin in the prior step.
Now, click on Select. You will then be required to enter the following details:
- Unique name for the instance
- Description (optional)
- Select the workspace where you would like to deploy this instance (e.g. qa)
In the example below, we have selected the "small" SKU.
The deployment of a compute instance can take ~5 minutes as it creates the required resources. Once it is deployed successfully, it will look like the following.
Note
Unless the compute profile backing the compute instance is configured as a dedicated resource, all compute instances are isolated, virtual clusters that are deployed onto a shared, multi-tenant host Kubernetes cluster.
Access Compute Instance
Users can remotely access compute instances using Rafay's Zero Trust Kubectl. Compute instances are "virtual Kubernetes clusters", and users can access them remotely in two ways:
- Via an "integrated browser shell" in the Rafay Platform, OR
- By downloading the zero trust kubeconfig file and using it with the Kubectl CLI utility.
In the steps below, we will try out both scenarios.
Integrated Web-based Shell
Click on the operational compute instance you want to access.
Now, click on the "Kubectl" button.
This will open a shell inside the user's web browser where the user can perform Kubectl operations.
Download & Use Kubeconfig
Users can also download the zero trust kubeconfig file and use it with the Kubectl CLI on their laptops.
Click on the download kubeconfig icon and save it on your laptop.
Configure your Kubectl CLI with the downloaded Kubeconfig file using the command shown below.
export KUBECONFIG=<name of downloaded file>
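For example, on macOS or Linux the command might look like the sketch below. The filename is illustrative; use the actual name of the file you downloaded.

```shell
# Illustrative filename: replace with the actual downloaded kubeconfig.
export KUBECONFIG="$HOME/Downloads/compute-instance-kubeconfig.yaml"

# Confirm which kubeconfig the CLI will use for subsequent commands.
echo "Using kubeconfig: $KUBECONFIG"
```

The variable applies only to the current terminal session; open a new terminal and you will need to export it again.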
Now, let us test connectivity to the remote cluster using our Kubectl CLI. In the example below, we are getting the list of namespaces on the remote cluster. We will use this approach in the next step to deploy a test application to our remote compute instance.
kubectl get ns
NAME              STATUS   AGE
default           Active   19h
kube-node-lease   Active   19h
kube-public       Active   19h
kube-system       Active   19h
rafay-system      Active   19h
Deploy Test App
We will now deploy a simple application configured to request the use of "1 GPU" in our compute instance. There are two ways users can deploy to the remote compute instance.
Kubectl Web Shell
We will deploy a simple Kubernetes application to our compute instance using the zero trust kubectl web shell integrated into the Rafay Platform. This approach is well suited for users who cannot download and install the Kubectl CLI utility on their laptops.
- Click on your compute instance in the Developer Hub
- Now, click on the Kubectl button.
This will open a kubectl web shell connected to the compute instance (virtual cluster) running on the remote host cluster.
Determine GPU Type
The GPUs in the shared infrastructure powering your compute instance may be configured with access to "Full GPUs" or "Time Sliced GPUs". The YAML spec for requesting GPUs differs between the two scenarios.
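As a point of reference, a full-GPU request is typically expressed through the `nvidia.com/gpu` extended resource in the container's resource limits, as in the minimal sketch below. This is an illustrative pod spec, not the repository's manifest; the full-gpu.yaml and timesliced.yaml files used later in this guide are the authoritative specs, and the exact resource name and count for time-sliced GPUs depend on how the cluster's NVIDIA device plugin was configured.

```yaml
# Minimal illustrative sketch of a full-GPU request (not the repo's manifest).
apiVersion: v1
kind: Pod
metadata:
  name: gpu-test            # hypothetical name
spec:
  containers:
  - name: app
    image: nginx            # any image; real GPU workloads use CUDA-based images
    resources:
      limits:
        nvidia.com/gpu: 1   # ask the scheduler for one full GPU
```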
Now, let's check what kind of GPUs are attached to our node:
kubectl describe node name-of-your-node
Info
The name of your node will be auto-populated in the web shell.
Look for the section called "Allocated resources".
Name:               mohan-lab-0
.......
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests      Limits
  --------           --------      ------
  cpu                920m (5%)     3600m (22%)
  memory             1504Mi (11%)  3530Mi (26%)
  ephemeral-storage  0 (0%)        0 (0%)
  hugepages-1Gi      0 (0%)        0 (0%)
  hugepages-2Mi      0 (0%)        0 (0%)
  nvidia.com/gpu     1             1
Events:              <none>
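If you prefer a quick filter to reading the whole node description, you can grep for the GPU row. The snippet below runs that filter over a saved sample of the output; in the web shell you would pipe the live command (e.g. `kubectl describe node <your-node> | grep nvidia.com/gpu`) instead.

```shell
# Save a sample of the "Allocated resources" section to a file. In practice,
# pipe the live command:  kubectl describe node <your-node> | grep nvidia.com/gpu
cat > /tmp/node-describe.txt <<'EOF'
Allocated resources:
  Resource           Requests      Limits
  cpu                920m (5%)     3600m (22%)
  nvidia.com/gpu     1             1
EOF

# Filter the output down to the GPU row.
grep 'nvidia.com/gpu' /tmp/node-describe.txt
```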
Deploy using Kubectl Browser Shell
Scenario 1: Full GPU
Copy/paste the command below.
kubectl apply -f https://raw.githubusercontent.com/RafaySystems/getstarted/refs/heads/master/gpupaas/full-gpu.yaml
Scenario 2: Time Sliced GPU
Copy/paste the command below.
kubectl apply -f https://raw.githubusercontent.com/RafaySystems/getstarted/refs/heads/master/gpupaas/timesliced.yaml
For either scenario, you should see output like the following. Your pod should be operational in a few seconds.
deployment.apps/nginx-gpu-deployment configured
Check Status
You can check the status of the application by typing the following command.
kubectl get po
You should see something like the example below.
NAME                                    READY   STATUS    RESTARTS   AGE
nginx-gpu-deployment-669c8f7d9f-2skhm   1/1     Running   0          3m58s
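Optionally, you can verify that the pod actually sees the GPU by running `nvidia-smi` inside it. This is a best-effort sketch: the deployment name comes from the sample manifests, and whether `nvidia-smi` is available inside the container depends on the image and the NVIDIA runtime configuration, which is why the command falls back to a message when it is not.

```shell
# Best-effort GPU check inside the pod (deployment name from the sample
# manifests; nvidia-smi availability depends on the image and runtime).
kubectl exec deploy/nginx-gpu-deployment -- nvidia-smi \
  || echo "nvidia-smi not available in this container"
```

If the NVIDIA runtime injected the tools, you should see the familiar `nvidia-smi` table listing one GPU.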
Deploy using Kubectl CLI
- Copy the relevant YAML into a file
- Using the downloaded kubeconfig file from the prior step, run the following command in the CLI
Deploy Test App
kubectl apply -f filename.yaml
In a few seconds, you should see something like the following:
deployment.apps/nginx-gpu-deployment created
Check Status
You can check the status of the pod as shown in the example below. Our application was able to get access to 1 NVIDIA GPU.
kubectl get po
NAME                                    READY   STATUS    RESTARTS   AGE
nginx-gpu-deployment-669c8f7d9f-wsf55   1/1     Running   0          44s
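When you are done testing, you can remove the deployment. A sketch, assuming KUBECONFIG still points at your compute instance and `filename.yaml` is the manifest you saved earlier; `--ignore-not-found` makes the command safe to re-run if the resources are already gone.

```shell
# Remove the test deployment; --ignore-not-found avoids an error if the
# resources were already deleted, and || true keeps the script moving on.
kubectl delete -f filename.yaml --ignore-not-found || true
```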