
Request GPU

In this Get Started guide, you will perform a simple test against a remote Ray endpoint. You will use Ray to request a single GPU for a job, and use the Ray Job Submission API to submit the job to a Ray cluster. This guide assumes the following:

  • You have already created a "Ray as Service" tenant using Rafay.
  • You have the HTTPS URL and access credentials for the remote endpoint.
  • You have Python 3 installed on your laptop.

Review Code

Download the file "gpu_task.py". This code requests a single GPU and then performs a small tensor operation on it using PyTorch.

# gpu_task.py
import torch
import ray

# Initialize Ray (it will automatically connect if you are using Ray Job Submission)
ray.init()

@ray.remote(num_gpus=1)
def gpu_task():
    print("Running on GPU" if torch.cuda.is_available() else "No GPU available.")

    # Create a tensor and move it to the GPU
    tensor = torch.randn(1000, 1000).to("cuda" if torch.cuda.is_available() else "cpu")

    # Perform a simple matrix multiplication on the GPU
    result = torch.matmul(tensor, tensor)

    print("Computation finished. Result tensor shape:", result.shape)
    return result.shape

if __name__ == "__main__":
    # Run the remote function that requires a GPU
    future = gpu_task.remote()
    result = ray.get(future)
    print("Result from GPU task:", result)

Ray Initialization

The script calls ray.init() to initialize Ray. When the script runs through the Job Submission API, it connects to the Ray cluster automatically.

Remote Function with GPU Request

@ray.remote(num_gpus=1)
def gpu_task(): ...

  • The num_gpus=1 parameter requests 1 GPU for the gpu_task function.
  • The function uses PyTorch to perform a simple tensor multiplication on the GPU, falling back to the CPU if CUDA is not available.

Tensor Operation

Creates a random tensor and multiplies it with itself using matrix multiplication.

Main Execution

Submits the gpu_task and waits for the result using ray.get().


Job Submission Code

Download the source code file "run.py" and open it in your favorite IDE such as VS Code to review it. As you can see from the code snippet below, we will be using Ray's Job Submission Client to submit a job to the remote Ray endpoint.

import urllib3
from ray.job_submission import JobSubmissionClient

# Suppress the warning about unverified HTTPS requests since
# we are using self-signed certificates for testing
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

# Initialize the Ray Job Submission client
client = JobSubmissionClient(
    "https://alpha-ray.hictl.dev.rafay-edge.net", 
    headers={"Authorization": "Basic YWRtaW46NntOZ2pSZ1lJWg=="}, 
    verify=False  # Disable SSL verification
)

# Submit the job to run the gpu_task.py script
submission = client.submit_job(
    entrypoint="python gpu_task.py", 
    runtime_env={
        "pip": ["torch==1.9.0"],  # Specify pip dependencies
        "working_dir": "./"  # Specify the working directory
    }
)

Now, replace the value of the Authorization header with the Base64-encoded credentials for your Ray endpoint. You can use the following command to perform the encoding.

echo -n 'admin:PASSWORD' | base64
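
If a base64 command is not available on your machine, the same header value can be produced in Python. The following is a minimal sketch: the helper name basic_auth_header is hypothetical and not part of run.py, and "PASSWORD" is a placeholder for your own password.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict:
    """Build an HTTP Basic Authorization header (hypothetical helper)."""
    # Basic auth encodes "username:password" as Base64
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


if __name__ == "__main__":
    # "PASSWORD" is a placeholder; substitute the password for your endpoint
    print(basic_auth_header("admin", "PASSWORD"))
```

The returned dictionary has the same shape as the headers argument passed to JobSubmissionClient in run.py, so it could be supplied there instead of a hard-coded string.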

Submit Job

To submit the job to your remote Ray endpoint:

  • First, open the Ray Dashboard's URL in your web browser and keep it open. We will monitor the status and progress of the submitted job here.
  • Now, open Terminal and enter the following command:

python3 ./run.py

This will submit the gpu_task.py script to the Ray cluster, request a GPU for the computation, and print the output logs as the job runs. Note that the logs from the job will be streamed.
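
The run.py snippet shown earlier ends at submit_job, which returns the job's submission ID. If your copy of the script does not already follow the job, one way to wait for it from the terminal is to poll its status. The sketch below is an assumption-laden illustration rather than part of the guide's code: wait_for_job and its parameters are hypothetical names, and it relies only on the client's get_job_status method.

```python
import time

# Job states after which the status no longer changes
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "STOPPED"}


def wait_for_job(client, job_id, poll_seconds=2.0, timeout_seconds=600.0,
                 sleep=time.sleep):
    """Poll client.get_job_status(job_id) until the job reaches a terminal state.

    Hypothetical helper: `client` is expected to be a JobSubmissionClient
    (or any object with a compatible get_job_status method).
    Returns the final status as a plain string.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = client.get_job_status(job_id)
        # Accept either an enum member or a plain string
        name = str(getattr(status, "value", status))
        if name in TERMINAL_STATES:
            return name
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job {job_id} still {name} after {timeout_seconds}s")
        sleep(poll_seconds)


# Possible usage at the end of run.py (assumes `client` and `submission` from that script):
#   final_status = wait_for_job(client, submission)
#   print("Final status:", final_status)
#   print(client.get_job_logs(submission))
```

Polling keeps the script synchronous and simple. The client also provides tail_job_logs, an asynchronous iterator, if you prefer to stream log lines as they are produced.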