Overview

In this self-paced exercise, you will configure and provision a GPU-enabled Amazon EKS cluster that will run the Triton Inference Server.

Triton enables teams to deploy any AI model from multiple deep learning and machine learning frameworks, including TensorRT, TensorFlow, PyTorch, ONNX, OpenVINO, Python, RAPIDS FIL, and more. Triton supports inference across cloud, data center, edge, and embedded devices on NVIDIA GPUs, x86 and ARM CPUs, or AWS Inferentia. Triton delivers optimized performance for many query types, including real-time, batched, ensembles, and audio/video streaming.
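By the end of the exercise, the cluster will expose a Triton endpoint that clients can send inference requests to. As a rough preview of what that looks like, the sketch below uses the tritonclient Python package against Triton's default HTTP port; the model name ("my_model") and tensor names ("INPUT0", "OUTPUT0") are placeholders, and the input shape and dtype must match whatever your model's config.pbtxt declares.

```python
# A minimal sketch of a Triton HTTP inference request, assuming a model
# named "my_model" (hypothetical) is loaded and the server is reachable
# on localhost:8000 (Triton's default HTTP port).
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# Confirm the server and the model are up before sending requests.
assert client.is_server_ready()
assert client.is_model_ready("my_model")

# Build a single input tensor; the name, shape, and datatype here
# ("INPUT0", 1x3x224x224, FP32) are assumptions for illustration.
data = np.random.rand(1, 3, 224, 224).astype(np.float32)
infer_input = httpclient.InferInput("INPUT0", list(data.shape), "FP32")
infer_input.set_data_from_numpy(data)

result = client.infer(model_name="my_model", inputs=[infer_input])
print(result.as_numpy("OUTPUT0").shape)  # "OUTPUT0" is likewise assumed
```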


What Will You Do by Part

Part  What will you do?
1     Setup and Configuration
2     Provision an Amazon EKS Cluster with GPUs
3     Create a Cluster Blueprint with the NVIDIA GPU Operator
4     Deploy a Workload to the GPU-enabled Amazon EKS Cluster
5     Deprovision the Amazon EKS Cluster
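Parts 2 through 4 hinge on the cluster's nodes actually advertising their GPUs to Kubernetes, which the NVIDIA GPU Operator's device plugin handles by exposing the nvidia.com/gpu resource. As a quick sanity check once the cluster is up, a sketch like the following (assuming the official kubernetes Python client and a kubeconfig pointing at the provisioned cluster) lists the nodes that report allocatable GPUs:

```python
# A minimal sketch, assuming the `kubernetes` Python client is installed
# and your kubeconfig targets the GPU-enabled EKS cluster. Nodes only
# advertise nvidia.com/gpu once the GPU Operator's device plugin runs.
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

for node in v1.list_node().items:
    gpus = (node.status.allocatable or {}).get("nvidia.com/gpu", "0")
    if gpus != "0":
        print(f"{node.metadata.name}: {gpus} allocatable GPU(s)")
```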

Assumptions

  • You have access to an AWS account with privileges to create an IAM Role with the default Full IAM Policy, so that resources can be provisioned on your behalf as part of the EKS cluster lifecycle.
  • You have a Git client installed on your laptop.
  • You have the AWS CLI installed and configured (a quick verification sketch follows this list).
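Before starting Part 1, it is worth confirming that your credentials resolve to the account you intend to use. The sketch below uses boto3 (assuming it is installed and picks up the same credentials as the AWS CLI); the region is an assumption and should match where you plan to provision the cluster.

```python
# A quick pre-flight check, assuming boto3 is installed and credentials
# are configured (e.g. via `aws configure` or environment variables).
import boto3

sts = boto3.client("sts")
identity = sts.get_caller_identity()
print("Account:", identity["Account"])
print("Caller ARN:", identity["Arn"])

# Listing EKS clusters exercises the same credentials against the
# service used throughout this exercise (the region is an assumption).
eks = boto3.client("eks", region_name="us-west-2")
print("Existing EKS clusters:", eks.list_clusters()["clusters"])
```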