Cluster Lifecycle

SLURM Cluster administrators can deploy a SLURM Cluster with one click. The cluster is deployed onto a host Kubernetes cluster.


New Cluster

Follow these steps to create and launch a new SLURM Cluster.

  • In Developer Hub, navigate to Compute Type -> Slurm Clusters and click New Slurm Cluster
  • Choose Slurm on Kubernetes from the available options and click Select to continue

[Screenshot: Select Catalog]

Enter the following details in the configuration form:

  • Provide a name for the cluster and select the workspace where the cluster should be deployed
  • Select a Shared Volume Size. The shared volume will be used by all users to store data that can be accessed by both the login and compute nodes
  • Enter an SSH Public Key that will be used by the root user to access the login nodes
  • Select the number of Compute Nodes to create
  • Select the number of GPUs per Compute Node
  • Select the number of CPUs per Compute Node
  • Select the amount of memory per Compute Node
  • Click Deploy to provision the SLURM Cluster
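Before filling in the form, it can help to check the values locally. The sketch below is only an illustration: `SlurmClusterRequest` and its field names are hypothetical and are not part of any Developer Hub API, the gigabyte units are an assumption, and so is the rule that zero GPUs per node is acceptable. The SSH key check relies only on the standard OpenSSH public key layout (`<type> <base64> [comment]`, where the decoded data begins with the length-prefixed key type).

```python
import base64
import binascii
import struct
from dataclasses import dataclass


def looks_like_ssh_public_key(line: str) -> bool:
    """Return True if `line` has the OpenSSH shape '<type> <base64> [comment]'."""
    parts = line.strip().split()
    if len(parts) < 2:
        return False
    key_type, blob_b64 = parts[0], parts[1]
    try:
        blob = base64.b64decode(blob_b64, validate=True)
    except (binascii.Error, ValueError):
        return False
    if len(blob) < 4:
        return False
    # The decoded blob starts with a 4-byte big-endian length,
    # followed by the key type name repeated from the first field.
    (name_len,) = struct.unpack(">I", blob[:4])
    return blob[4:4 + name_len] == key_type.encode("utf-8")


@dataclass
class SlurmClusterRequest:
    """Hypothetical container mirroring the form fields (not a platform API)."""
    name: str
    workspace: str
    shared_volume_size_gb: int  # unit is an assumption
    ssh_public_key: str
    compute_nodes: int
    gpus_per_node: int
    cpus_per_node: int
    memory_per_node_gb: int  # unit is an assumption

    def problems(self) -> list[str]:
        """Return readable problems; an empty list means the values look sane."""
        found = []
        if not self.name.strip():
            found.append("name is empty")
        if not self.workspace.strip():
            found.append("workspace is empty")
        if not looks_like_ssh_public_key(self.ssh_public_key):
            found.append("ssh_public_key is not in OpenSSH public key format")
        for field_name in ("shared_volume_size_gb", "compute_nodes",
                           "cpus_per_node", "memory_per_node_gb"):
            if getattr(self, field_name) < 1:
                found.append(f"{field_name} must be at least 1")
        # Assumption: CPU-only compute nodes (0 GPUs) are allowed.
        if self.gpus_per_node < 0:
            found.append("gpus_per_node must not be negative")
        return found
```

Calling `problems()` on a request returns an empty list when nothing looks wrong. The limits the form actually enforces may differ from these checks.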

[Screenshot: Deploy Cluster]

Provisioning takes approximately 5 minutes. Once the cluster has finished deploying, the information needed to access it is displayed.

[Screenshot: Deploy Cluster]
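Once the access information is available, the root user can log in to a login node with the private key that matches the SSH Public Key entered in the form. The sketch below only assembles the `ssh` command line. The host name, port, and key path are placeholders that must be replaced with the values shown for your cluster, and `login_command` is a hypothetical helper, not something the platform provides. `sinfo` is the standard Slurm command that lists partitions and node states, which makes it a quick check that the compute nodes have registered.

```python
import shlex


def login_command(host, port=22, identity_file="~/.ssh/id_ed25519",
                  remote_command=None):
    """Build an ssh argument list for logging in to a login node as root.

    All defaults are assumptions; use the host and port shown in the
    cluster's access information and the path to your own private key.
    """
    cmd = ["ssh", "-i", identity_file, "-p", str(port), f"root@{host}"]
    if remote_command:
        # ssh runs this command on the login node instead of opening a shell.
        cmd.append(remote_command)
    return cmd


if __name__ == "__main__":
    # "login.example.com" is a placeholder, not a real cluster address.
    print(shlex.join(login_command("login.example.com", remote_command="sinfo")))
```

The printed command can be pasted into a terminal, or the returned list can be passed to `subprocess.run` to execute it directly.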

Info

Provisioning time can increase with the size of the compute node images being used.


Destroy Cluster

Follow these steps to destroy an existing SLURM Cluster.

  • In Developer Hub, navigate to Compute Type -> Slurm Clusters
  • Click Actions -> Delete on the cluster instance you want to delete

After approximately 2 minutes, the cluster will be fully deprovisioned.