Provisioning Explained
GKE Cluster Provisioning Steps Flowchart¶
A Google GKE cluster is provisioned by the controller in multiple steps. Each step is mapped to a reconciler in the Controller. Each reconciler is designed to be independent and re-entrant: it can be run repeatedly, and each run tries to bring its step to the desired state. The following steps happen during the provisioning of a GKE cluster. Click on a step to read more information about that step.
flowchart TD
A[Cluster Initialized] -->
B[Cluster Bootstrap Node Initialized] -->
C[Cluster Provider Infra Initialized] -->
D[Cluster Spec Applied] -->
E[Cluster Control Plane Ready] -->
F[Cluster Nodes Ready] -->
G[Cluster Operator Spec Applied] -->
H[Cluster Healthy] -->
I[Cluster Pivoted] -->
J[Cluster Bootstrap Node Deleted]
click A "./#cluster-initialized"
click B "./#cluster-bootstrap-node-initialized"
click C "./#cluster-provider-infra-initialized"
click D "./#cluster-spec-applied"
click E "./#cluster-control-plane-ready"
click F "./#cluster-nodes-ready"
click G "./#cluster-operator-spec-applied"
click H "./#cluster-healthy"
click I "./#cluster-pivoted"
click J "./#cluster-bootstrap-node-deleted"
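The step ordering in the flowchart can be modeled as a re-entrant pipeline. The sketch below is illustrative only: the `Step` type, the `reconcile` function, and the `state` dictionary are hypothetical names, not the controller's actual implementation. Each step reports whether its desired state has been reached. Completed steps are skipped, so calling `reconcile` again after a failure resumes at the step that failed.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


# Hypothetical model: a step is a name plus a function that tries to
# reach the step's desired state and returns True once it has.
@dataclass
class Step:
    name: str
    run: Callable[[Dict[str, bool]], bool]


def reconcile(steps: List[Step], state: Dict[str, bool]) -> Optional[str]:
    """Run steps in order; return the name of the first step that is not
    yet done, or None when every step has reached its desired state."""
    for step in steps:
        if state.get(step.name):
            continue  # re-entrant: completed steps are not repeated
        if not step.run(state):
            return step.name  # stop here; a later call resumes at this step
        state[step.name] = True
    return None


# Example: the second step fails on the first pass and succeeds on the second.
flaky_calls = {"count": 0}


def flaky(_state: Dict[str, bool]) -> bool:
    flaky_calls["count"] += 1
    return flaky_calls["count"] >= 2


steps = [
    Step("Cluster Initialized", lambda s: True),
    Step("Cluster Bootstrap Node Initialized", flaky),
    Step("Cluster Provider Infra Initialized", lambda s: True),
]
state: Dict[str, bool] = {}
first_pass = reconcile(steps, state)   # blocked at the flaky step
second_pass = reconcile(steps, state)  # resumes there and completes
```

Keeping the completion record in `state` rather than inside the loop is what makes the second call safe: no finished step is executed twice.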
GKE Cluster Provisioning Steps Explained¶
Cluster Initialized¶
This step runs most of the "preflight checks" against the configured GCP account and validates the user-provided configuration. If an invalid configuration is detected, provisioning fails and an error message is presented to the user. Users can then edit the configuration and issue the provision request again. Some examples of these checks are listed below:
- For a regional cluster, validate that the region name is a valid GCP region and that the defaultZoneLocations provided belong to that region.
- Validate if the network-name exists in the project provided.
- Validate that the subnet-name belongs to the network specified.
Note
This step cannot validate whether the user's preBootCommands will fail on the target cluster.
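The region and zone check described above can be sketched as follows. GCP zone names consist of the region name plus a short suffix (for example, zone `us-central1-a` is in region `us-central1`), which the sketch relies on. The function name `validate_zones` and the `known_regions` set are hypothetical; a real preflight check would look up the valid regions and zones from GCP instead of a hard-coded set.

```python
from typing import Iterable, List, Set


def validate_zones(region: str, zones: Iterable[str],
                   known_regions: Set[str]) -> List[str]:
    """Return a list of error messages; an empty list means the input passed."""
    errors: List[str] = []
    if region not in known_regions:
        errors.append(f"unknown region: {region}")
    for zone in zones:
        # A zone such as "us-central1-a" belongs to region "us-central1".
        zone_region, _, suffix = zone.rpartition("-")
        if zone_region != region or not suffix:
            errors.append(f"zone {zone} is not in region {region}")
    return errors


# Hypothetical sample data standing in for a lookup against GCP.
regions = {"us-central1", "europe-west1"}
ok = validate_zones("us-central1", ["us-central1-a", "us-central1-b"], regions)
bad = validate_zones("us-central1", ["europe-west1-b"], regions)
```

Collecting all errors before returning lets the user fix every problem in one edit of the configuration instead of one per provision attempt.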
Cluster Bootstrap Node Initialized¶
To provision a GKE cluster remotely and securely in a GCP account, the controller requires a footprint in the form of an "ephemeral bootstrap node". In this step, the bootstrap node is created in GCP as a VM, with a startup-script passed as metadata to set up the VM when it boots. For zonal clusters, the VM is created in the zone specified in the zonal cluster configuration.
- Creates a unique agent configuration for the cluster (config for infra-agent).
- Constructs a bootstrap-startup-script that describes how to download and deploy the infra-agent on the bootstrap VM. If the configuration involves a proxy, the proxy settings are included in this script.
- For regional clusters, the user must specify a zone, and the bootstrap VM is created in that zone. The zone must be in the region where you want the cluster brought up.
- Concludes with a bootstrap VM that is "up and running" with the "infra agent" successfully deployed on the VM.
Note
If a failure occurs, this step is retried until it succeeds.
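As a rough illustration of how a bootstrap startup script with optional proxy settings might be assembled, consider the sketch below. Everything specific in it is an assumption: the function name, the download URL, the file paths, and the `--config` flag of the agent binary are placeholders, not the controller's real script. On Compute Engine, a script like this is normally attached to the VM under the `startup-script` metadata key.

```python
from typing import Optional


def build_startup_script(agent_url: str, agent_config_path: str,
                         proxy: Optional[str] = None) -> str:
    """Render a shell startup script for the bootstrap VM (illustrative only)."""
    lines = ["#!/bin/bash", "set -euo pipefail"]
    if proxy:
        # Export proxy settings before any download happens.
        lines.append(f'export HTTP_PROXY="{proxy}"')
        lines.append(f'export HTTPS_PROXY="{proxy}"')
    # Placeholder download and launch commands; the agent's real
    # location and command-line flags are not documented here.
    lines.append(f'curl -fsSL "{agent_url}" -o /tmp/infra-agent')
    lines.append("chmod +x /tmp/infra-agent")
    lines.append(f'/tmp/infra-agent --config "{agent_config_path}"')
    return "\n".join(lines) + "\n"


script = build_startup_script(
    "https://example.com/infra-agent",       # hypothetical URL
    "/etc/infra-agent/config.yaml",          # hypothetical path
    proxy="http://proxy.example.com:3128",   # hypothetical proxy
)
```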
Cluster Provider Infra Initialized¶
This step starts with the "infra agent" connecting to the Controller. Once the secure connection is established, this step executes commands to set up CAPG (Cluster API Provider GCP), a lightweight "kind cluster", and clusterctl on the VM. All necessary images (Docker, kind, kubectl, clusterctl, cert-manager, and CAPG) are downloaded from the Controller's container registry. If a network proxy is configured, all communication goes through the proxy.
Note
If a failure occurs, this step is retried until it succeeds.
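The retry behavior in the note can be pictured as a loop that repeats an action until it stops failing. The sketch below is a generic pattern, not the controller's code; in particular, the exponential backoff with a cap is an assumed policy, since the delay between retries is not documented here. The `sleep` parameter is injectable so the example can run without actually waiting.

```python
import time
from typing import Callable, List


def retry_until_success(action: Callable[[], None],
                        base_delay: float = 1.0,
                        max_delay: float = 60.0,
                        sleep: Callable[[float], None] = time.sleep) -> int:
    """Call action until it stops raising; return the number of attempts."""
    attempt = 0
    delay = base_delay
    while True:
        attempt += 1
        try:
            action()
            return attempt
        except Exception:
            # Assumed policy: exponential backoff capped at max_delay.
            sleep(delay)
            delay = min(delay * 2, max_delay)


# Example with a fake sleep so nothing actually waits.
calls = {"n": 0}


def setup_step() -> None:
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")


waits: List[float] = []
attempts_needed = retry_until_success(setup_step, sleep=waits.append)
```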
Cluster Spec Applied¶
This step constructs a "declarative cluster specification" file using the provided configuration. The cluster spec YAML file is then applied to the kind cluster on the bootstrap node to bring up the target GKE cluster in the configured GCP account/project.
Note
If a failure occurs, this step is retried until it succeeds.
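To make the idea of a declarative cluster specification concrete, the sketch below builds a minimal set of Cluster API objects for a managed GKE cluster. It is an assumption-laden illustration: the kinds `GCPManagedCluster` and `GCPManagedControlPlane` and the field names follow the upstream CAPG managed-cluster types as commonly templated and should be checked against the CRDs actually installed. The spec generated by the controller is likely to contain more objects and fields. The helper name `build_cluster_spec` is hypothetical.

```python
import json
from typing import Any, Dict, List

CAPI_VERSION = "cluster.x-k8s.io/v1beta1"
INFRA_VERSION = "infrastructure.cluster.x-k8s.io/v1beta1"


def build_cluster_spec(name: str, project: str, region: str,
                       network: str) -> List[Dict[str, Any]]:
    """Return Cluster API objects describing the desired GKE cluster."""
    control_plane = f"{name}-control-plane"
    return [
        {
            "apiVersion": CAPI_VERSION,
            "kind": "Cluster",
            "metadata": {"name": name},
            "spec": {
                # The Cluster object points at its infra and control plane.
                "infrastructureRef": {"apiVersion": INFRA_VERSION,
                                      "kind": "GCPManagedCluster",
                                      "name": name},
                "controlPlaneRef": {"apiVersion": INFRA_VERSION,
                                    "kind": "GCPManagedControlPlane",
                                    "name": control_plane},
            },
        },
        {
            "apiVersion": INFRA_VERSION,
            "kind": "GCPManagedCluster",
            "metadata": {"name": name},
            "spec": {"project": project, "region": region,
                     "network": {"name": network}},
        },
        {
            "apiVersion": INFRA_VERSION,
            "kind": "GCPManagedControlPlane",
            "metadata": {"name": control_plane},
            "spec": {"project": project, "location": region},
        },
    ]


# Sample values are placeholders.
docs = build_cluster_spec("demo", "my-project", "us-central1", "default")
rendered = "\n".join(json.dumps(doc, indent=2) for doc in docs)
```

Because the objects only describe the desired end state, applying the same spec again is harmless, which fits the retry behavior of this step.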
Cluster Control Plane Ready¶
This step waits for the GKE control plane on the target cluster to become ready by checking the control plane ready status with the CAPG management cluster. For GKE, this typically takes approximately 10-15 minutes.
Note
If a failure occurs, this step is retried until it succeeds.
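Waiting for the control plane amounts to polling a readiness status until it turns true or a deadline passes. The following sketch shows that pattern under stated assumptions: `wait_until_ready` and its parameters are hypothetical, the 20-minute timeout and 30-second interval are arbitrary example values, and `is_ready` stands in for whatever call reads the control plane status from the management cluster. The clock and sleep functions are injectable so the example runs instantly.

```python
import time
from typing import Callable


def wait_until_ready(is_ready: Callable[[], bool],
                     timeout_s: float,
                     interval_s: float,
                     clock: Callable[[], float] = time.monotonic,
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll is_ready until it returns True or timeout_s elapses."""
    deadline = clock() + timeout_s
    while True:
        if is_ready():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


class FakeClock:
    """Test double: sleeping just advances the simulated time."""

    def __init__(self) -> None:
        self.now = 0.0

    def time(self) -> float:
        return self.now

    def sleep(self, seconds: float) -> None:
        self.now += seconds


fake = FakeClock()
# Pretend the control plane reports ready after 12 simulated minutes.
became_ready = wait_until_ready(lambda: fake.now >= 12 * 60,
                                timeout_s=20 * 60, interval_s=30,
                                clock=fake.time, sleep=fake.sleep)
```

A `False` result would hand control back to the reconciler, which can simply run the step again.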
Cluster Nodes Ready¶
This step ensures all of the worker nodes are operational.
Note
If a failure occurs, this step is retried until it succeeds.
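One way to decide whether all worker nodes are operational is to inspect the `Ready` condition that Kubernetes reports for each node. The sketch below works on a node list in the shape returned by the Kubernetes API (for example, the JSON from `kubectl get nodes -o json`). Whether the controller uses exactly this check is an assumption, and the function name and sample data are made up for illustration.

```python
from typing import Any, Dict, List


def not_ready_nodes(node_list: Dict[str, Any]) -> List[str]:
    """Return names of nodes whose Ready condition is not "True"."""
    pending: List[str] = []
    for node in node_list.get("items", []):
        conditions = node.get("status", {}).get("conditions", [])
        ready = any(c.get("type") == "Ready" and c.get("status") == "True"
                    for c in conditions)
        if not ready:
            pending.append(node["metadata"]["name"])
    return pending


# Made-up sample in the shape of a Kubernetes NodeList.
sample = {
    "items": [
        {"metadata": {"name": "node-a"},
         "status": {"conditions": [{"type": "Ready", "status": "True"}]}},
        {"metadata": {"name": "node-b"},
         "status": {"conditions": [{"type": "Ready", "status": "False"}]}},
        {"metadata": {"name": "node-c"}, "status": {}},
    ]
}
still_pending = not_ready_nodes(sample)
```

A node with no `Ready` condition at all, such as `node-c` above, is treated as not ready.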
Cluster Operator Spec Applied¶
This step executes any preBootCommands configured to run on the target cluster. The cluster is brought to its desired state based on the configured cluster blueprint.
Note
If a failure occurs, this step is retried until it succeeds.
Cluster Healthy¶
This step is deemed successful once the controller and the managed cluster have established a control channel.
Note
If a failure occurs, this step is retried until it succeeds.
Cluster Pivoted¶
Pivot is a Cluster API/CAPG concept of making the workload cluster self-managed. In this step, the CAPI/CAPG management components are moved from the bootstrap VM to the target cluster. Once this step is completed, all the required management components run on the workload GKE cluster, and all future lifecycle management can be performed locally.
Note
If a failure occurs, this step is retried until it succeeds.
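In upstream Cluster API, a pivot is commonly performed with `clusterctl move`, which copies the Cluster API objects from one management cluster to another. Whether the controller invokes it in exactly this way is an assumption. The sketch below only builds the command line; the helper name, kubeconfig paths, and namespace are hypothetical placeholders.

```python
from typing import List


def clusterctl_move_args(bootstrap_kubeconfig: str,
                         target_kubeconfig: str,
                         namespace: str) -> List[str]:
    """Build the argv for moving Cluster API objects to the target cluster."""
    return [
        "clusterctl", "move",
        "--kubeconfig", bootstrap_kubeconfig,   # source: kind cluster on the bootstrap VM
        "--to-kubeconfig", target_kubeconfig,   # destination: the new GKE cluster
        "--namespace", namespace,               # namespace holding the cluster objects
    ]


# Placeholder paths and namespace for illustration.
args = clusterctl_move_args("/root/.kube/kind.yaml",
                            "/root/.kube/gke.yaml",
                            "default")
```

The resulting list can be passed to `subprocess.run(args, check=True)` on a machine where `clusterctl` is installed and both kubeconfig files exist.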
Cluster Bootstrap Node Deleted¶
Once the cluster has been successfully pivoted, the bootstrap VM is no longer needed and is deleted. Once this step is complete, provisioning is deemed complete.