Use
In this step, we will use the end-user-facing Developer Hub self-service portal to deploy an instance of the Slinky-based Slurm cluster.
- Navigate to Developer Hub
- Select the project where the template was previously created
- Click "Workspaces"
- Click "New Workspace"
- Enter a name for the workspace
- Click Save
- Click "Custom Services"
- Click "New Custom Service"
- Click "Select" on the Slurm on Kubernetes service card
- Enter a name for the instance
- Select the number of nodes for the cluster
- Enter the public SSH key to be used to access the cluster
- Click Deploy
After a few minutes, the instance will be deployed.
Utilize
Next, we will use the newly deployed Slurm cluster.
- Copy the Slurm Access command from the output
- Update the command with the path to the private SSH key that matches the public SSH key provided during deployment
- Run the command, then type "yes" to accept the host key fingerprint and continue
- Execute the following command within the SSH session to see the state of the nodes and partitions
SINFO Command
The sinfo command in Slurm (Simple Linux Utility for Resource Management) is used to display information about the state of the nodes and partitions (i.e., queues) in a Slurm-managed cluster.
Type the following command in the SSH session:
sinfo
You will see output similar to the following:
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
all* up infinite 2 idle default-[0-1]
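If you want to check this output from a script rather than by eye, the following is a minimal Python sketch that parses the sample output shown above. The helper names `parse_sinfo` and `expand_nodelist` are hypothetical and are not part of Slurm. The sketch assumes the default whitespace-separated `sinfo` columns and only expands a single bracketed range such as `default-[0-1]`; real Slurm hostlists can be more complex. The trailing `*` on a partition name marks the default partition.

```python
import re

# Sample text copied from the output shown above. On a live cluster this
# would come from running `sinfo` inside the SSH session.
SAMPLE_SINFO = """\
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
all* up infinite 2 idle default-[0-1]
"""


def expand_nodelist(nodelist):
    """Expand a simple hostlist such as 'default-[0-1]' into node names.

    Hypothetical helper: handles one bracket group containing
    comma-separated numbers or ranges, and returns anything else unchanged.
    """
    match = re.fullmatch(r"([^\[\]]+)\[([^\[\]]+)\](.*)", nodelist)
    if match is None:
        return [nodelist]
    prefix, body, suffix = match.groups()
    names = []
    for part in body.split(","):
        start, _, end = part.partition("-")
        end = end or start
        width = len(start)  # keep zero padding, e.g. [01-03]
        for index in range(int(start), int(end) + 1):
            names.append(f"{prefix}{index:0{width}d}{suffix}")
    return names


def parse_sinfo(text):
    """Turn default-format sinfo output into a list of dictionaries."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        row = dict(zip(header, line.split()))
        # A trailing '*' marks the default partition.
        row["DEFAULT"] = row["PARTITION"].endswith("*")
        row["PARTITION"] = row["PARTITION"].rstrip("*")
        row["NODES"] = int(row["NODES"])
        row["NODE_NAMES"] = expand_nodelist(row["NODELIST"])
        rows.append(row)
    return rows


partitions = parse_sinfo(SAMPLE_SINFO)
idle_nodes = sum(p["NODES"] for p in partitions if p["STATE"] == "idle")
print(partitions[0]["PARTITION"], idle_nodes, partitions[0]["NODE_NAMES"])
```

For the sample above, this reports the `all` partition with two idle nodes, `default-0` and `default-1`.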
SQUEUE Command
The squeue command in Slurm is used to view the status of jobs in the scheduling queue, both running and pending. Execute the following command within the SSH session to see the status of jobs on the cluster:
squeue
You will see output similar to the following:
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
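A header line with no rows beneath it means the queue is empty, which is expected on a freshly deployed cluster. As a minimal sketch, assuming the default `squeue` columns shown above, the Python below parses that output and counts jobs by state. The helper name `parse_squeue` is hypothetical. In the `ST` column, `R` means running and `PD` means pending. Job names that contain spaces would need a custom `squeue` output format and are not handled here.

```python
from collections import Counter

# Sample text copied from the output shown above (header only, no jobs).
SAMPLE_SQUEUE = "JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)\n"


def parse_squeue(text):
    """Turn default-format squeue output into a list of dictionaries."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split()
    jobs = []
    for line in lines[1:]:
        # Limit the number of splits so the last column, which can hold a
        # reason such as "(Resources)", stays in a single field.
        fields = line.split(None, len(header) - 1)
        jobs.append(dict(zip(header, fields)))
    return jobs


jobs = parse_squeue(SAMPLE_SQUEUE)
states = Counter(job["ST"] for job in jobs)
print(f"jobs={len(jobs)} running={states['R']} pending={states['PD']}")
```

For the header-only sample, every count is zero.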