Provisioning Explained

GKE Cluster Provisioning Steps Flowchart

The controller provisions a Google GKE cluster in multiple steps, and each step is mapped to a reconciler in the controller. Each reconciler is designed to be independent and re-entrant: it can run repeatedly, and each run tries to bring its step to the desired state. The following steps happen during the provisioning of a GKE cluster. Click on a step to read more about it.

flowchart TD
    A[Cluster Initialized] -->
    B[Cluster Bootstrap Node Initialized] -->
    C[Cluster Provider Infra Initialized] -->
    D[Cluster Spec Applied] -->
    E[Cluster Control Plane Ready] -->
    F[Cluster Nodes Ready] -->
    G[Cluster Operator Spec Applied] -->
    H[Cluster Healthy] -->
    I[Cluster Pivoted] -->
    J[Cluster Bootstrap Node Deleted]
    click A "./#cluster-initialized"
    click B "./#cluster-bootstrap-node-initialized"
    click C "./#cluster-provider-infra-initialized"
    click D "./#cluster-spec-applied"
    click E "./#cluster-control-plane-ready"
    click F "./#cluster-nodes-ready"
    click G "./#cluster-operator-spec-applied"
    click H "./#cluster-healthy"
    click I "./#cluster-pivoted"
    click J "./#cluster-bootstrap-node-deleted"
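
The re-entrant behavior described above can be illustrated with a minimal sketch. It assumes a hypothetical `ClusterState` record and one callable per step; the controller's real types and function names are not documented here, so every identifier below is illustrative only.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Step names taken from the flowchart, in provisioning order.
STEPS: List[str] = [
    "Cluster Initialized",
    "Cluster Bootstrap Node Initialized",
    "Cluster Provider Infra Initialized",
    "Cluster Spec Applied",
    "Cluster Control Plane Ready",
    "Cluster Nodes Ready",
    "Cluster Operator Spec Applied",
    "Cluster Healthy",
    "Cluster Pivoted",
    "Cluster Bootstrap Node Deleted",
]


@dataclass
class ClusterState:
    """Hypothetical record of which steps have reached their desired state."""
    done: Dict[str, bool] = field(default_factory=dict)


# A reconciler returns True once its step has reached the desired state.
Reconciler = Callable[[ClusterState], bool]


def reconcile_once(state: ClusterState,
                   reconcilers: Dict[str, Reconciler]) -> Optional[str]:
    """Run the first incomplete step and return its name (None if all are done)."""
    for step in STEPS:
        if state.done.get(step):
            continue  # re-entrant: completed steps are skipped, not repeated
        state.done[step] = reconcilers[step](state)
        return step
    return None
```

Calling `reconcile_once` in a loop advances one step at a time; a step whose reconciler returns `False` is simply attempted again on the next call.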

GKE Cluster Provisioning Steps Explained

Cluster Initialized

This step runs most of the "preflight checks" against the configured GCP account and validates the user-provided configuration. If an invalid configuration is detected, provisioning fails and an error message is presented to the user. The user can then edit the configuration and issue the provision request again. Some examples of these checks are listed below:

  • For a regional cluster, validate that the region name is a valid GCP region and that the defaultZoneLocations provided belong to that region.
  • Validate that the network-name exists in the project provided.
  • Validate that the subnet-name belongs to the network specified.
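
As a rough illustration of the checks listed above, the sketch below validates a configuration against in-memory data. The `ClusterConfig` fields, the `preflight_errors` function, and the lookup tables are hypothetical stand-ins; a real implementation would query the GCP APIs instead. The zone check relies on the GCP naming convention that a zone name is its region name plus a letter suffix (for example `us-central1-a`).

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class ClusterConfig:
    """Hypothetical subset of the user-provided configuration."""
    region: str
    default_zone_locations: List[str]
    network_name: str
    subnet_name: str


def preflight_errors(cfg: ClusterConfig,
                     valid_regions: Set[str],
                     subnets_by_network: Dict[str, Set[str]]) -> List[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors: List[str] = []
    if cfg.region not in valid_regions:
        errors.append(f"unknown region: {cfg.region}")
    for zone in cfg.default_zone_locations:
        # GCP zones are named <region>-<letter>, e.g. us-central1-a.
        if zone.rsplit("-", 1)[0] != cfg.region:
            errors.append(f"zone {zone} is not in region {cfg.region}")
    if cfg.network_name not in subnets_by_network:
        errors.append(f"network {cfg.network_name} not found in project")
    elif cfg.subnet_name not in subnets_by_network[cfg.network_name]:
        errors.append(
            f"subnet {cfg.subnet_name} is not part of network {cfg.network_name}")
    return errors
```

Collecting all problems in one pass, rather than stopping at the first, lets the user correct the whole configuration before issuing the provision request again.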

Note

This step cannot validate whether the user's preBootCommands will fail on the target cluster.


Cluster Bootstrap Node Initialized

To remotely and securely provision a GKE cluster in a GCP account, the controller requires a footprint in the form of an "ephemeral bootstrap node". In this step, the bootstrap node is created in GCP as a VM, with a startup-script supplied as metadata to set up the VM. For zonal clusters, the VM is created in the zone specified in the zonal cluster configuration.

  • Creates a unique agent configuration for the cluster (the configuration for the infra-agent).
  • Constructs a bootstrap-startup-script that describes how to download and deploy the infra-agent on the bootstrap VM. If the configuration involves a proxy, the proxy settings are included in this script.
  • For regional clusters, asks the user to specify a zone and creates the bootstrap VM in that zone. The zone must be in the region where the cluster is to be brought up.
  • Concludes with a bootstrap VM that is "up and running" with the "infra agent" successfully deployed on it.
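
The startup script construction described above might look roughly like the following sketch. The download URL, the install path, and the `--config` flag of the infra-agent are all hypothetical placeholders, since the actual script contents are not documented here; only the overall shape (optional proxy settings first, then download and launch) follows the text.

```python
import shlex
from typing import Optional


def render_startup_script(agent_url: str,
                          agent_config_path: str,
                          proxy_url: Optional[str] = None) -> str:
    """Render a bash startup script that downloads and starts the infra-agent.

    All paths and the agent's --config flag are illustrative assumptions.
    """
    lines = ["#!/bin/bash", "set -euo pipefail"]
    if proxy_url:
        # Proxy settings come first so the download below goes through the proxy.
        lines.append(f"export HTTPS_PROXY={shlex.quote(proxy_url)}")
    lines += [
        f"curl -fsSL -o /usr/local/bin/infra-agent {shlex.quote(agent_url)}",
        "chmod +x /usr/local/bin/infra-agent",
        # Hypothetical invocation; the real agent's command line may differ.
        f"/usr/local/bin/infra-agent --config {shlex.quote(agent_config_path)}",
    ]
    return "\n".join(lines) + "\n"
```

Values are passed through `shlex.quote` so that a URL or path containing shell metacharacters cannot alter the script.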

Note

If a failure occurs, this step is retried until it succeeds.
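
The retry behavior in the note can be sketched as a simple loop. The exponential backoff and its limits are assumptions added for illustration; the documentation only states that the step is retried until it succeeds. The `retry_until_success` name and its parameters are hypothetical.

```python
import time
from typing import Callable


def retry_until_success(step: Callable[[], None],
                        base_delay: float = 1.0,
                        max_delay: float = 60.0,
                        sleep: Callable[[float], None] = time.sleep) -> int:
    """Call step() until it stops raising; return the number of attempts made."""
    attempts = 0
    delay = base_delay
    while True:
        attempts += 1
        try:
            step()
            return attempts
        except Exception:
            # Assumed policy: wait, then double the delay up to max_delay.
            sleep(delay)
            delay = min(delay * 2, max_delay)
```

The `sleep` parameter is injectable so the loop can be exercised without real waiting.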


Cluster Provider Infra Initialized

This step starts with the "infra agent" connecting to the controller. Once the secure connection is established, this step executes commands and sets up CAPG (Cluster API Provider GCP). This step also sets up a lightweight "kind cluster" and clusterctl on the VM. All necessary images (Docker, kind, kubectl, clusterctl, cert-manager, and CAPG) are downloaded from the controller's container registry. If a network proxy is configured, all communication goes through the proxy.

Note

If a failure occurs, this step is retried until it succeeds.


Cluster Spec Applied

This step constructs a "declarative cluster specification" file using the provided configuration. The cluster spec YAML file is then applied to the kind cluster on the bootstrap node to bring up the target GKE cluster in the configured GCP account/project.
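
For orientation, the sketch below builds a minimal Cluster API `Cluster` object of the kind such a specification typically contains. The kind names and API versions are assumed from upstream Cluster API and CAPG and may not match what the controller actually generates; the `build_cluster_spec` helper and the naming scheme for the control plane object are hypothetical. The output is JSON, which `kubectl apply -f` also accepts.

```python
import json
from typing import Any, Dict


def build_cluster_spec(name: str, namespace: str) -> Dict[str, Any]:
    """Build a minimal Cluster object that references GKE-managed resources."""
    # Assumed upstream API group/version for CAPG infrastructure kinds.
    infra_api = "infrastructure.cluster.x-k8s.io/v1beta1"
    return {
        "apiVersion": "cluster.x-k8s.io/v1beta1",
        "kind": "Cluster",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "infrastructureRef": {
                "apiVersion": infra_api,
                "kind": "GCPManagedCluster",
                "name": name,
            },
            "controlPlaneRef": {
                "apiVersion": infra_api,
                "kind": "GCPManagedControlPlane",
                # Hypothetical naming convention for the control plane object.
                "name": f"{name}-control-plane",
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_cluster_spec("demo-gke", "default"), indent=2))
```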

Note

If a failure occurs, this step is retried until it succeeds.


Cluster Control Plane Ready

This step waits for the GKE control plane on the target cluster to become ready by checking the control plane ready status with the CAPG management cluster. This step can take approximately 10-15 minutes for GKE.
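
Waiting on a readiness status is commonly implemented as polling with a deadline. The sketch below assumes a hypothetical `is_ready` callable that reports the control plane status; the 20 minute timeout and 30 second interval are illustrative values chosen to leave headroom over the 10-15 minutes mentioned above, not documented settings.

```python
import time
from typing import Callable


def wait_for_control_plane(is_ready: Callable[[], bool],
                           timeout_s: float = 20 * 60,
                           interval_s: float = 30,
                           clock: Callable[[], float] = time.monotonic,
                           sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll is_ready() until it returns True or the timeout elapses."""
    deadline = clock() + timeout_s
    while True:
        if is_ready():
            return True
        if clock() >= deadline:
            return False  # caller decides whether to retry the whole step
        sleep(interval_s)
```

Using a monotonic clock avoids miscounting the timeout if the system time is adjusted while waiting.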

Note

If a failure occurs, this step is retried until it succeeds.


Cluster Nodes Ready

This step ensures all of the worker nodes are operational.

Note

If a failure occurs, this step is retried until it succeeds.


Cluster Operator Spec Applied

This step executes any preBootCommands that you configured to run on the target cluster. The cluster is brought to its desired state based on the configured cluster blueprint.

Note

If a failure occurs, this step is retried until it succeeds.


Cluster Healthy

This step is deemed successful once the controller and the managed cluster have established a control channel.

Note

If a failure occurs, this step is retried until it succeeds.


Cluster Pivoted

Pivot is a Cluster API/CAPG concept of making the workload cluster self-managed. In this step, the CAPI/CAPG management cluster is moved from the bootstrap VM to the target cluster. Once this step is completed, all the required management components will be running on the workload GKE cluster. All future lifecycle management can then be performed locally.
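
In upstream Cluster API, a pivot is usually performed with `clusterctl move`. The sketch below only assembles that command line; whether the controller invokes clusterctl in exactly this way is an assumption, and the `clusterctl_move_argv` helper and the kubeconfig paths are hypothetical.

```python
from typing import List, Optional


def clusterctl_move_argv(source_kubeconfig: str,
                         target_kubeconfig: str,
                         namespace: Optional[str] = None) -> List[str]:
    """Build the argv for moving Cluster API objects to the target cluster."""
    argv = [
        "clusterctl", "move",
        "--kubeconfig", source_kubeconfig,      # kind cluster on the bootstrap VM
        "--to-kubeconfig", target_kubeconfig,   # workload GKE cluster
    ]
    if namespace:
        argv += ["--namespace", namespace]
    return argv
```

Returning an argument list rather than a shell string means it can be passed to `subprocess.run` without shell quoting concerns.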

Note

If a failure occurs, this step is retried until it succeeds.


Cluster Bootstrap Node Deleted

Once the cluster has been successfully pivoted, the bootstrap VM is no longer needed and is deleted. When this step finishes, provisioning is complete.