Provisioning

The following sequence diagram describes the high-level steps carried out during the provisioning process. Customers can optionally automate the entire sequence using Rafay's APIs or automation tools.

Provisioning Workflow


Demo Video

Watch a video of provisioning a "Converged, Multi Master" Rafay MKS cluster on CentOS with only local storage.


STEP 1: Select Cluster Configuration

Review the supported cluster configurations and select your desired cluster configuration. This will determine the number of nodes you need to prepare to initiate cluster provisioning.

Type                        Number of Initial Nodes
Converged, Single Master    1 Node (1 Master/Worker)
Dedicated, Single Master    2 Nodes (1 Master + 1 Worker)
Converged, Multi Master     3 Nodes (3 Masters/Workers)
Dedicated, Multi Master     4 Nodes (3 Masters + 1 Worker)
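
As a rough sketch of how the table above could be used in automation, the helper below maps a configuration to its initial node count and checks a prepared node list against it. The function names and the short configuration keys (for example `converged-multi`) are hypothetical and not part of any Rafay tool.

```shell
# Hypothetical helper: map a cluster configuration to the number of nodes
# needed for initial provisioning, per the table above.
required_nodes() {
  case "$1" in
    converged-single) echo 1 ;;
    dedicated-single) echo 2 ;;
    converged-multi)  echo 3 ;;
    dedicated-multi)  echo 4 ;;
    *) echo "unknown configuration: $1" >&2; return 1 ;;
  esac
}

# Check that the prepared node list has exactly the required count.
# Usage: check_node_count <configuration> <node> [<node> ...]
check_node_count() {
  local config="$1"
  shift
  local want
  want="$(required_nodes "$config")" || return 1
  if [ "$#" -ne "$want" ]; then
    echo "need exactly $want node(s) for $config, got $#" >&2
    return 1
  fi
  echo "ok: $# node(s) for $config"
}
```

For example, `check_node_count dedicated-single 10.0.0.1 10.0.0.2` would pass, while the same call with one address would fail.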

STEP 2: Prepare Nodes

Create VMs or bare metal instances compatible with the infrastructure requirements.

  • Ensure that you have SSH access to all the instances/VMs
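
One way to confirm SSH access before continuing is sketched below. It assumes key-based login and uses the standard OpenSSH options `BatchMode` and `ConnectTimeout`. The function name, the `SSH_BIN` override (handy for dry runs), and the sample targets are illustrative assumptions.

```shell
# SSH_BIN is a hypothetical override so the check can be exercised without
# real nodes; it defaults to the regular ssh client.
SSH_BIN="${SSH_BIN:-ssh}"

# Usage: check_ssh_access <keyfile> <user@host> [<user@host> ...]
check_ssh_access() {
  local key="$1"
  shift
  local failed=0 target
  for target in "$@"; do
    # BatchMode=yes disables password prompts, so a node without working
    # key-based access fails instead of hanging on a prompt.
    if "$SSH_BIN" -i "$key" -o BatchMode=yes -o ConnectTimeout=5 "$target" true; then
      echo "reachable: $target"
    else
      echo "unreachable: $target"
      failed=1
    fi
  done
  return "$failed"
}
```

A call such as `check_ssh_access keypair.pem ubuntu@10.0.0.1 ubuntu@10.0.0.2` returns a non-zero status if any node could not be reached.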

Important

Ensure you have the exact number of nodes for initial provisioning required by the cluster configuration selected in the previous step. Additional worker nodes should be added after the cluster is successfully provisioned.


STEP 3: Create a Cluster

  • Log in to the Rafay Console.
  • Navigate to the Project where you would like the cluster provisioned.
  • Click on New Cluster
  • Select "Create a New Cluster" and click Continue
  • Select "Environment" as "Data center/Edge"
  • Select "Linux Installer"
  • Give it a name and click Continue

On Prem Cluster Wizard

Cluster Name

  • Provide a "Unique Name" for the cluster

Cluster Location

  • Select a location for the cluster from the drop down list

Cluster Blueprint

  • Select cluster blueprint from the drop down.
  • If you created a custom blueprint, select it and select the blueprint version.
  • If not, accept the default blueprint provided by Rafay

Kubernetes Version

  • Select the Kubernetes version that you want to deploy

Operating System

  • Select the OS and Version you used for the nodes

Storage

  • Select GlusterFS if you require distributed storage.
  • If selecting multiple storage types, select the default storage class.

Deployment Options

  • Enable "Approve Nodes Automatically" if you do not require an approval gate for nodes to join the cluster
  • Enable "Install GPU drivers" if your nodes support GPUs and you want Rafay to provision the required drivers

Cluster Configuration

  • Select Multi Master if you chose a multi-master cluster configuration in STEP 1
  • Select Dedicated Master if you chose a dedicated-master cluster configuration in STEP 1

Important

Auto Approval of nodes helps streamline the cluster provisioning and expansion workflows by eliminating the "manual" approval gate for nodes to join the cluster.


STEP 4: Download Conjurer and Secrets

  • Review the Node Installation Instructions section on the Rafay Console
  • Download the cluster bootstrap binary (i.e., Rafay Conjurer)
  • Download the cluster activation secrets (i.e., passphrase and credential files)
  • SCP the three (3) files to the nodes you created in STEP 2

Important

Note that the activation secrets (passphrase and credentials) are unique per cluster and cannot be reused for other clusters.

An illustrative example is provided below. This assumes that you have the three downloaded files in the current working directory. The three files will be securely uploaded to the “/tmp” folder on the instance.

$ scp -i <keypairfile.pem> * ubuntu@<Node's External IP Address>:/tmp 
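
For clusters with several nodes, the same copy can be repeated in a loop. The sketch below is one possible approach, not a Rafay-provided script. The file names follow the example used in the preflight step and are assumptions, as are the function name, the `BOOTSTRAP_FILES` and `DRY_RUN` variables, and the sample addresses.

```shell
# Space-separated list of the three downloaded files (assumed names; adjust
# to match what the Rafay Console gave you).
BOOTSTRAP_FILES="${BOOTSTRAP_FILES:-conjurer-linux-amd64.tar.bz2 onpremcluster-passphrase.txt onpremcluster-credentials.pem}"

# Usage: copy_to_nodes <keyfile> <user> <node> [<node> ...]
# With DRY_RUN=1 the scp commands are printed instead of executed.
copy_to_nodes() {
  local key="$1" user="$2" node
  shift 2
  for node in "$@"; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "scp -i $key $BOOTSTRAP_FILES ${user}@${node}:/tmp"
    else
      # BOOTSTRAP_FILES is left unquoted on purpose so it splits into
      # separate file arguments.
      scp -i "$key" $BOOTSTRAP_FILES "${user}@${node}:/tmp" || {
        echo "copy failed for $node" >&2
        return 1
      }
    fi
  done
}
```

Running `DRY_RUN=1 copy_to_nodes keypair.pem ubuntu 10.0.0.1 10.0.0.2` first is a cheap way to review the commands before any files are transferred.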

STEP 5: Perform Preflight Checks

It is strongly recommended to run the automated preflight tests on every node to ensure that the node has compatible hardware, software and configuration. The following preflight tests are currently performed:

#     Description and Type of Preflight Checks
1     Is the node running a compatible OS and version?
2     Does the node have minimum CPU resources?
3     Does the node have minimum memory resources?
4     Does the node have outbound Internet connectivity?
5     Is the node able to connect to the Rafay Controller?
6     Is the node able to perform a DNS lookup of the Rafay Controller?
7     Is the node able to establish an mTLS connection to the Rafay Controller?
8     Is the node's time synchronized with NTP?
9     Does the node have minimum and compatible storage?
10    Is Docker already installed on the node?
11    Is Kubernetes already installed on the node?

  • From the node installation instructions, copy the preflight check command
  • SSH into the node and run the command using the provided passphrase and credentials

An illustrative example is shown below where the "preflight checks" detected an incompatible node for provisioning.

tar -xjf conjurer-linux-amd64.tar.bz2 && sudo ./conjurer -edge-name="onpremcluster" -passphrase-file="onpremcluster-passphrase.txt" -creds-file="onpremcluster-credentials.pem" -t

[+] Performing pre-tests
    [+] Operating System check
    [+] CPU check
    [+] Memory check
    [+] Internet connectivity check
    [+] Connectivity check to rafay registry
    [+] DNS Lookup to the controller
    [+] Connectivity check to the Controller
    !INFO: Attempting mTLS connection to salt.core.stage.rafay-edge.net:443
    [+] Multiple default routes check
    [+] Time Sync check
    [+] Storage check
    !WARNING: No raw unformatted volume detected with more than 50GB. Cannot configure node as a master or storage node.

[+] Detected following errors during the above checks
    !ERROR: System Memory 28GB is less than the required 32GB.
    !ERROR: Detected a previously installed version of Docker on this node. Please remove the prior Docker package and retry.
    !ERROR: Detected a previously installed version of Kubernetes on this node. Please remove the prior Kubernetes packages (kubectl, kubeadm, kubelet,kubernetes-cni, etc.) and retry.
  • If there are no warnings or errors, proceed to the next step
  • If there are warnings or errors, fix the issues and rerun the preflight check before proceeding to the next step
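
When preflight checks are run on many nodes, the go/no-go decision above can be scripted. The sketch below assumes the output format shown in the example, where problems appear on lines containing `!ERROR:` or `!WARNING:`. The function name `preflight_summary` is hypothetical.

```shell
# Reads preflight output on stdin, prints a one-line summary, and returns
# non-zero if any error or warning lines were found.
preflight_summary() {
  local output errors warnings
  output="$(cat)"
  # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
  errors="$(printf '%s\n' "$output" | grep -c '!ERROR:' || true)"
  warnings="$(printf '%s\n' "$output" | grep -c '!WARNING:' || true)"
  echo "errors=$errors warnings=$warnings"
  [ "$errors" -eq 0 ] && [ "$warnings" -eq 0 ]
}
```

The preflight command copied from the console can be piped into this function. Lines marked `!INFO:` are deliberately not counted.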

STEP 6: Run Conjurer

  • From the node installation instructions, copy the provided command to run the Rafay Conjurer binary
  • SSH into the nodes and run the installer using the provided passphrase and credentials.

An illustrative example is provided below.

sudo ./conjurer -edge-name="onpremcluster" -passphrase-file="onpremcluster-passphrase.txt" -creds-file="onpremcluster-credentials.pem"

[+]  Initiating edge node install

[+] Provisioning node 
      [+] Step 1. Installing node-agent
      [+] Step 2. Setting hostname to node-72djl2g-192-168-0-20-onpremcluster 
      [+] Step 3. Installing credentials on node
      [+] Step 4. Configuring node-agent
      [+] Step 5. Starting node-agent 

[+] Successfully provisioned node 
  • The conjurer is a “cluster bootstrap agent” that connects and registers the nodes with the Rafay Controller. Information about the Controller and authentication credentials for registration is available in the activation secrets files.

  • Once this step is complete, the node will show up on the Rafay Console as DISCOVERED.
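
If the installer is run on several nodes, its output can be checked automatically before looking for the node in the console. The sketch below assumes the log format shown in the example above: a "Setting hostname to ..." line and a final "Successfully provisioned node" line. The function name `conjurer_result` is hypothetical.

```shell
# Reads Conjurer output on stdin and reports whether the node registered,
# along with the hostname the installer assigned (if one was logged).
conjurer_result() {
  local output host
  output="$(cat)"
  host="$(printf '%s\n' "$output" \
    | sed -n 's/.*Setting hostname to \([^ ]*\).*/\1/p' | head -n 1)"
  if printf '%s\n' "$output" | grep -q 'Successfully provisioned node'; then
    echo "provisioned: ${host:-unknown}"
  else
    echo "not provisioned: ${host:-unknown}"
    return 1
  fi
}
```

A non-zero return only means the success line was missing from the captured output. The Rafay Console remains the authoritative place to confirm that the node is DISCOVERED.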


STEP 7: Approve Node

This is an optional approval step that acts as a security control to ensure that administrators can inspect and approve a node before it can become part of the cluster.

  • Click the Approve button to approve the node for this cluster
  • In a few seconds, the status of the node will be updated to “Approved” in the Rafay Console
  • Once approved, the node is automatically probed and all information about the node is presented to the administrator on the Rafay Console.

STEP 8: Configure Node

This is a mandatory configuration step that allows the infrastructure administrator to specify the “role” for the node. The administrator also provides critical information such as the Ingress IP address and storage details for the node.

Important

Without the configuration step, cluster provisioning cannot be initiated.

  • Click on Configure and follow the instructions provided by the wizard
  • Provide at least one Ingress IP for the cluster
  • Select the “Storage” role and select the storage location (unformatted, raw block device) from the drop down list.
  • Click Save

Configure Ingress IP


STEP 9: Initiate Provisioning

At this point, everything the Rafay Controller needs to start provisioning Kubernetes and all required software add-ons has been provided. These will be automatically provisioned and configured to operationalize the cluster.

  • Click on Provision
  • A progress bar is displayed showing progress as the software is downloaded, installed and configured on all the nodes.
  • Once provisioning is complete, the cluster will report itself as “READY” to accept workloads.

Important

The end-to-end provisioning process can take 10 to 40 minutes, depending on the number of nodes in your cluster and the Internet bandwidth available to your nodes.

Once provisioning is complete, users will be presented with a cluster card on the Rafay Console.

Successfully Provisioned Cluster

  • Click on the cluster name and select the Configuration tab to view the provisioned cluster's configuration details

Provisioned Cluster Config