
Developer Self Service via Cluster Templates

Our May release adds a number of new features and enhancements, which we have covered in our other blog posts. This post focuses on Cluster Templates for GKE, which enable customers to implement developer self service for Kubernetes clusters.

We added support for cluster templates in early 2022, starting with Amazon EKS, followed by Azure AKS and, with this release, Google's GKE. Common use cases for cluster templates are "ephemeral clusters" for lower environments such as:

  • Developer Test Beds
  • QA environments
  • Product support to replicate customer issues

Cluster Templates

Non-production clusters such as developer test beds and QA clusters are ephemeral (i.e. needed for only a few hours or days). They are generally required at very short notice and need to be functional within a few minutes. It is not practical to have an Ops/SRE person on staff just to service these requests.

Our customers have been telling us that the right solution for this challenge is to enable developers to provision and use Kubernetes clusters by themselves. However, developers are pressed for time and have no interest in learning Terraform or the details of cloud infrastructure. They just need access to a Kubernetes cluster to test their application when they need it. Giving them unfettered access to cloud environments is not possible in most organizations because Operations and Security teams need oversight over "which infrastructure resources are created" and "where they are created" for reasons such as cost management, security policies and governance.

Cluster Templates in Rafay were created primarily to address this challenge. Cluster templates allow Platform/Ops/SRE teams to provide a self service experience for developers without losing control over governance and compliance.


How do Cluster Templates work?

At a high level, there are two distinct steps that need to be followed.

flowchart LR
    subgraph rafay[Rafay]
        ct[Cluster Template]
    end

    subgraph aws[AWS]
        k1[EKS Cluster 1]
        k2[EKS Cluster 2]
    end

    subgraph azure[Azure]
        k3[AKS Cluster 1]
        k4[AKS Cluster 2]
    end

    subgraph gcp[Google Cloud]
        k5[GKE Cluster 1]
        k6[GKE Cluster 2]
    end

    rafay-.->aws
    rafay-.->azure
    rafay-.->gcp

    subgraph app[App Team]
        direction LR
        dev[Developer]
        qa[QA Engineer]
    end

    subgraph ops[Platform Team]
        direction LR
        plat[Platform Engineer]
        operations[Ops/SRE]
    end

    ops--Create Cluster Template-->ct
    app--Uses Cluster Template-->ct

Step 1: Create Cluster Template

This one-time task is performed by an administrative user with a privileged role (e.g. Infra Admin) who is familiar with infrastructure and compliance. The administrator specifies what needs to be templatized and how much freedom the developer should have. In doing so, the administrator is essentially doing two things:

  • Specify and encapsulate the freedom/restriction for infrastructure resource creation.
  • Abstract the details of the resource creation by exposing limited configuration for the user to deal with.

Overrides

Overrides are configurations that a developer may be allowed to specify during cluster provisioning. There are three types of override settings in a cluster template. Administrators can use these to control what a developer can/cannot do with a cluster template during cluster provisioning.

  • Not Allowed
  • Allowed
  • Allowed Limited

In the example shown below, the administrator has decided to not allow any overrides except the name of the cluster. Note that everything else is abstracted out for the developer.

Create Cluster Template
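
To make the three override types concrete, here is a minimal Python sketch of how an override decision could be modeled. It is not Rafay's schema or API: the names `OverridePolicy`, `TemplateField` and `resolve` are hypothetical, and it assumes that "Allowed Limited" means the developer may pick only from a list of values approved by the administrator.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Optional, Sequence


class OverridePolicy(Enum):
    """The three override types; identifiers are illustrative, not Rafay's."""
    NOT_ALLOWED = "not_allowed"
    ALLOWED = "allowed"
    ALLOWED_LIMITED = "allowed_limited"


@dataclass(frozen=True)
class TemplateField:
    """One templatized setting and the freedom the developer has over it."""
    name: str
    default: Any
    policy: OverridePolicy = OverridePolicy.NOT_ALLOWED
    # Assumption: only consulted when policy is ALLOWED_LIMITED.
    allowed_values: Sequence[Any] = ()

    def resolve(self, requested: Optional[Any] = None) -> Any:
        """Return the effective value, or raise if the override is rejected."""
        if requested is None:
            # No override requested: fall back to the administrator's default.
            return self.default
        if self.policy is OverridePolicy.NOT_ALLOWED:
            raise ValueError(f"{self.name}: overrides are not allowed")
        if (
            self.policy is OverridePolicy.ALLOWED_LIMITED
            and requested not in self.allowed_values
        ):
            raise ValueError(
                f"{self.name}: {requested!r} is not one of {list(self.allowed_values)}"
            )
        return requested


# Example: the cluster name is free-form, the region is restricted,
# and the machine type is locked down by the administrator.
cluster_name = TemplateField("name", None, OverridePolicy.ALLOWED)
region = TemplateField(
    "region", "us-west1", OverridePolicy.ALLOWED_LIMITED, ("us-west1", "us-east1")
)
machine_type = TemplateField("machine_type", "e2-standard-4")
```

With these definitions, `region.resolve("us-east1")` is accepted, while `machine_type.resolve("n2-standard-8")` raises a `ValueError` because that field does not allow overrides.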


Step 2: Use Cluster Template

Once the cluster template has been created, ensure that you provide the developer with a Cluster Admin role. This will allow the developer to provision clusters using cluster templates. We typically see organizations provide their developers with preset configurations for clusters so that developers can replicate infrastructure resources at will, much like a factory assembly line. With cluster templates, organizations benefit in the following ways:

Benefit 1

Organizations do not need to invest in developing and maintaining complex tooling

Benefit 2

Developers do not need to handle sensitive credentials for the cloud provider (AWS, Azure or GCP)

Benefit 3

Developers do not need to learn or familiarize themselves with complex infrastructure

In the example shown below, the developer is only allowed/required to specify the "name" for their cluster. In a nutshell, the developer can have an operational cluster in a single click/command.

Use Cluster Template
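
The developer-side flow can be sketched in a similar way. The snippet below is only an illustration under stated assumptions: the `TEMPLATE` dictionary, its keys and values, and the `render_cluster_spec` helper are hypothetical and do not reflect Rafay's actual template format or CLI. It shows a template whose only overridable setting is the cluster name being combined with the developer's input to produce a complete cluster specification.

```python
from typing import Any, Dict, Mapping

# Hypothetical template prepared by the platform team. Every value except
# the cluster name is fixed; "overridable" lists what a developer may set.
TEMPLATE: Dict[str, Any] = {
    "defaults": {
        "name": None,  # must be supplied by the developer
        "region": "us-west1",
        "node_count": 3,
        "machine_type": "e2-standard-4",
    },
    "overridable": {"name"},
}


def render_cluster_spec(
    template: Mapping[str, Any], overrides: Mapping[str, Any]
) -> Dict[str, Any]:
    """Merge developer overrides into the template defaults.

    Raises ValueError if the developer tries to set a locked field or
    leaves a required field (one with no default) unset.
    """
    blocked = sorted(set(overrides) - set(template["overridable"]))
    if blocked:
        raise ValueError(f"overrides not allowed for: {blocked}")

    spec = {**template["defaults"], **overrides}

    missing = sorted(key for key, value in spec.items() if value is None)
    if missing:
        raise ValueError(f"missing required values for: {missing}")
    return spec


# The developer supplies nothing but a name.
spec = render_cluster_spec(TEMPLATE, {"name": "dev-testbed-1"})
```

Everything other than the name comes from the template, so an attempt such as `render_cluster_spec(TEMPLATE, {"name": "x", "node_count": 50})` is rejected before any infrastructure would be created.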


Try It Out

Sign up here for a free trial and try it out yourself. We have developed a number of hands-on Getting Started Guides for Cluster Templates for EKS, AKS and GKE.