The following sequence diagram describes the high-level steps carried out during the provisioning process. Customers can optionally automate the entire sequence using the RCTL CLI, REST APIs, or automation tools.
Watch a video of provisioning a "Converged, Multi Master" upstream Kubernetes cluster on "CentOS" with only local storage.
STEP 1: Select Cluster Configuration
Review the supported cluster configurations and select your desired cluster configuration. This will determine the number of nodes you need to prepare to initiate cluster provisioning.
|Type|Number of Initial Nodes|
|----|------------------------|
|Converged, Single Master|1 Node (1 Master/Worker)|
|Dedicated, Single Master|2 Nodes (1 Master + 1 Worker)|
|Converged, Multi Master|3 Nodes (3 Masters/Workers)|
|Dedicated, Multi Master|4 Nodes (3 Masters + 1 Worker)|
STEP 2: Prepare Nodes
Create VMs or bare metal instances that meet the infrastructure requirements. Ensure that you have SSH access to all the instances/VMs.
Ensure you have the exact number of nodes required for initial provisioning per the cluster configuration selected in the previous step. Additional worker nodes should be added only after the cluster is successfully provisioned.
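The node counts from the Step 1 table can be captured in a small helper for automation scripts. A minimal shell sketch; the short type labels below are illustrative, not product terminology:

```shell
#!/bin/sh
# Map cluster type to the required number of initial nodes (per the Step 1 table).
# The short type labels are made up for illustration.
required_nodes() {
  case "$1" in
    converged-single) echo 1 ;;  # 1 Master/Worker
    dedicated-single) echo 2 ;;  # 1 Master + 1 Worker
    converged-multi)  echo 3 ;;  # 3 Masters/Workers
    dedicated-multi)  echo 4 ;;  # 3 Masters + 1 Worker
    *) echo "unknown cluster type: $1" >&2; return 1 ;;
  esac
}

required_nodes dedicated-multi   # prints 4
```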
STEP 3: Create a Cluster
- Log in to the Console
- Navigate to the Project where you would like the cluster provisioned.
- Click on New Cluster
- Select "Create a New Cluster" and click Continue
- Select "Environment" as "Data center/Edge"
- Select "Linux Installer"
- Give it a name and click Continue
- Provide a "Unique Name" for the cluster
- Select a location for the cluster from the drop down list
- Select cluster blueprint from the dropdown.
- If you created a custom blueprint, select it and select the blueprint version.
- If not, accept the default blueprint
- Select the Kubernetes version that you want to deploy
- Select the OS and Version you used for the nodes
- Select GlusterFS if you require distributed storage.
- If selecting multiple storage types, select the default storage class.
- Enable "Approve Nodes Automatically" if you do not require an approval gate for nodes to join the cluster
- Enable Install GPU drivers if your nodes support GPUs and you want the controller to provision required drivers
Auto Approval of nodes helps streamline the cluster provisioning and expansion workflows by eliminating the "manual" approval gate for nodes to join the cluster.
- Select Multi Master if you selected this cluster configuration
- Select Dedicated Master if you selected this cluster configuration
- Select "Enable Proxy" if the infrastructure being used to provision the cluster is behind a forward proxy.
- Configure the http proxy with the proxy information (ex: http://proxy.example.com:8080)
- Configure the https proxy with the proxy information (ex: http://proxy.example.com:8080)
- Configure No Proxy with a comma-separated list of hosts that must be reached without the proxy
- Configure the Root CA certificate of the proxy if proxy is listening on https
- Enable "TLS Termination Proxy" if proxy is listening on https and cannot provide the Root CA certificate of the proxy.
Proxy configuration cannot be changed once the cluster is created.
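Taken together, the proxy fields might look like the following. This is an illustrative fragment only; the proxy host, port, and bypass list are placeholders to adapt to your environment:

```shell
# Illustrative values only; proxy.example.com:8080 is a placeholder.
HTTP_PROXY=http://proxy.example.com:8080
HTTPS_PROXY=http://proxy.example.com:8080
# Hosts and CIDRs that must be reached without the proxy, comma separated.
# Including the pod and service subnets is common practice.
NO_PROXY=localhost,127.0.0.1,10.244.0.0/16,10.96.0.0/12
```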
The default subnet used for pod networking is "10.244.0.0/16". The default subnet used for Kubernetes services is "10.96.0.0/12".
If you want to customize the subnets used for Pod Networking and K8s Services:
- Configure the "Pod Subnet" with the subnet that you want to use.
- Configure the "Service Subnet" with the subnet that you want to use.
Cluster Networking cannot be changed once the cluster is created.
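If you do customize the subnets, it is worth verifying that the Pod and Service ranges do not overlap each other (or your node network) before provisioning, since networking cannot be changed later. A minimal POSIX shell sketch; the helper names are my own:

```shell
#!/bin/sh
# Sanity-check that two CIDR blocks do not overlap.

ip_to_int() {            # dotted-quad IP -> 32-bit integer
  oifs=$IFS; IFS=.
  set -- $1
  IFS=$oifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

cidr_range() {           # prints "first last" integers for a CIDR block
  ip=${1%/*}; bits=${1#*/}
  base=$(ip_to_int "$ip")
  size=$(( 1 << (32 - bits) ))
  first=$(( base / size * size ))   # mask off host bits
  echo "$first $(( first + size - 1 ))"
}

cidrs_overlap() {        # returns 0 (true) if the two CIDRs overlap
  set -- $(cidr_range "$1") $(cidr_range "$2")
  [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
}

# Check the documented defaults against each other.
if cidrs_overlap "10.244.0.0/16" "10.96.0.0/12"; then
  echo "subnets overlap: choose different ranges"
else
  echo "subnets do not overlap"
fi
```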
STEP 4: Download Conjurer and Secrets
- Review the Node Installation Instructions section on the Web Console
- Download the cluster bootstrap binary (i.e. Conjurer)
- Download the cluster activation secrets (i.e. passphrase and credential files)
- SCP the three (3) files to the nodes you created in the previous step
Note that the activation secrets (passphrase and credentials) are unique per cluster. You cannot reuse them for other clusters.
An illustrative example is provided below. This assumes that you have the three downloaded files in the current working directory. The three files will be securely uploaded to the “/tmp” folder on the instance.
$ scp -i <keypairfile.pem> * ubuntu@<Node's External IP Address>:/tmp
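With several nodes, the same copy can be scripted. A dry-run sketch with placeholder IPs, key file, and user; remove the leading echo to actually perform the copies:

```shell
#!/bin/sh
# Dry run: print the copy command for every node.
# The 198.51.100.x addresses, key file name, and user are placeholders.
NODES="198.51.100.10 198.51.100.11 198.51.100.12"
for ip in $NODES; do
  echo scp -i keypairfile.pem conjurer-linux-amd64.tar.bz2 \
    onpremcluster-passphrase.txt onpremcluster-credentials.pem "ubuntu@$ip:/tmp"
done
```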
STEP 5: Perform Preflight Checks
It is strongly recommended to perform the automated preflight tests on every node to ensure that the node has compatible hardware, software, and configuration. The following preflight tests are currently performed:
|#|Description and Type of Preflight Check|
|--|---------------------------------------|
|1|Is the node running a compatible OS and version?|
|2|Does the node have the minimum CPU resources?|
|3|Does the node have the minimum memory resources?|
|4|Does the node have outbound Internet connectivity?|
|5|Is the node able to connect to the Controller?|
|6|Is the node able to perform a DNS lookup of the Controller?|
|7|Is the node able to establish an mTLS connection to the Controller?|
|8|Is the node's time synchronized with NTP?|
|9|Does the node have the minimum and compatible storage?|
|10|Is Docker already installed on the node?|
|11|Is Kubernetes already installed on the node?|
- SSH into each node you prepared earlier
- From the node installation instructions, copy the preflight check command and run it using the provided passphrase and credential files
An illustrative example is shown below where the "preflight checks" detected an incompatible node for provisioning.
tar -xjf conjurer-linux-amd64.tar.bz2 && sudo ./conjurer -edge-name="onpremcluster" -passphrase-file="onpremcluster-passphrase.txt" -creds-file="onpremcluster-credentials.pem" -t
[+] Performing pre-tests
[+] Operating System check
[+] CPU check
[+] Memory check
[+] Internet connectivity check
[+] Connectivity check to controller registry
[+] DNS Lookup to the controller
[+] Connectivity check to the Controller
!INFO: Attempting mTLS connection to salt.core.stage.rafay-edge.net:443
[+] Multiple default routes check
[+] Time Sync check
[+] Storage check
!WARNING: No raw unformatted volume detected with more than 50GB. Cannot configure node as a master or storage node.
[+] Detected following errors during the above checks
!ERROR: System Memory 28GB is less than the required 32GB.
!ERROR: Detected a previously installed version of Docker on this node. Please remove the prior Docker package and retry.
!ERROR: Detected a previously installed version of Kubernetes on this node. Please remove the prior Kubernetes packages (kubectl, kubeadm, kubelet, kubernetes-cni, etc.) and retry.
- If there are no errors, proceed to the next step
- If there are warnings or errors, fix the issues and re-run the preflight checks before proceeding to the next step
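Some of these checks can be approximated by hand before running the automated preflight. A rough sketch for a Linux node; the 32 GB figure comes from the sample output above, and your actual requirements may differ:

```shell
#!/bin/sh
# Rough manual approximation of a few preflight checks (Linux only).
mem_kb=$(awk '/^MemTotal/ {print $2}' /proc/meminfo)
echo "memory: $(( mem_kb / 1024 / 1024 )) GB (sample preflight output expects 32 GB)"

# Preflight fails if Docker or Kubernetes is already installed; look for leftovers.
for cmd in docker kubelet kubeadm kubectl; do
  if command -v "$cmd" >/dev/null 2>&1; then
    echo "$cmd: found -- remove before provisioning"
  else
    echo "$cmd: not installed (ok)"
  fi
done
```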
STEP 6: Run Conjurer
- From the node installation instructions, copy the provided command to run the conjurer binary
- SSH into the nodes and run the installer using the provided passphrase and credentials.
An illustrative example is provided below.
sudo ./conjurer -edge-name="onpremcluster" -passphrase-file="onpremcluster-passphrase.txt" -creds-file="onpremcluster-credentials.pem" -t
[+] Initiating edge node install
[+] Provisioning node
[+] Step 1. Installing node-agent
[+] Step 2. Setting hostname to node-72djl2g-192-168-0-20-onpremcluster
[+] Step 3. Installing credentials on node
[+] Step 4. Configuring node-agent
[+] Step 5. Starting node-agent
[+] Successfully provisioned node
Conjurer is a "cluster bootstrap agent" that connects and registers the nodes with the Controller. Information about the Controller and the authentication credentials required for registration are contained in the activation secrets files.
Once this step is complete, the node will show up on the Web Console as DISCOVERED.
STEP 7: Approve Node
This is an optional approval step that acts as a security control to ensure that administrators can inspect and approve a node before it can become part of the cluster.
- Click on Approve button to approve the node to this cluster
- In a few seconds, you will see the status of the node being updated to "Approved" in the Web Console
- Once approved, the node is automatically probed and all information about the node is presented to the administrator on the Web Console.
STEP 8: Configure Node
This is a mandatory configuration step that allows the infrastructure administrator to specify the “role” for the node. They will also provide critical information such as Internet IP address and storage details for the node.
Without the configuration step, cluster provisioning cannot be initiated.
- Click on Configure and follow the instructions provided by the wizard
- Provide at least one Ingress IP for the cluster
- Select the “Storage” role and select the storage location (unformatted, raw block device) from the drop down list.
- Click Save
STEP 9: Initiate Provisioning
At this point, the Controller has everything it needs to start provisioning Kubernetes and all required software add-ons. These are automatically provisioned and configured to operationalize the cluster.
- Click on Provision
- A progress bar is displayed showing progress as the software is downloaded, installed and configured on all the nodes.
- Once provisioning is complete, the cluster will report itself as "READY" to accept workloads
The end-to-end provisioning process can take 10-40 minutes, depending on the number of nodes in your cluster and the Internet bandwidth available to your nodes.
Once provisioning is complete, users will be presented with a cluster card on the Web Console.
- Click on the cluster name and select the Configuration tab to view the provisioned cluster's configuration details
Provisioning clusters in remote, low-bandwidth locations with unstable networks can be very challenging. Please review how the retry and backoff mechanisms work by default and how they can be customized to suit your requirements.