
Getting Started with EKS


Overview

This self-paced guide demonstrates how to leverage Rafay's system templates for complete lifecycle management of Amazon EKS clusters. You'll learn to streamline cluster provisioning, management, and Day 2 operations using pre-configured, customizable templates from the template catalog.


Why Use System Templates for EKS?

System templates provide significant advantages for EKS cluster management:

  • Consistency & Speed: Pre-configured templates reduce setup time and ensure standardized deployments
  • Governance: Organization administrators can enforce compliance standards while allowing team customization
  • Integration: Seamless workflow integration with tools like ServiceNow and Jira
  • Collaboration: Enhanced team efficiency in managing EKS environments on AWS

Prerequisites

Required Access & Permissions

  • Access to an AWS environment
  • Sufficient privileges to create EKS clusters on AWS
  • An AWS Role ARN or AWS Access Key & Secret with EKS Permissions

Required Components

Agent Deployment & Permissions

When deploying the Rafay agent:

  • Ensure the EC2 instance has an IAM role with required AWS service permissions
  • The agent must be able to assume roles during execution
  • If you use an AWS Role ARN (instead of an Access Key/Secret), the EC2 instance's role needs permission to assume that role

Alternatively, the Rafay Agent can be deployed as a pod on your Kubernetes cluster.

  • When deployed this way, the agent pod runs on one of the cluster nodes and uses the IAM role associated with the node group's instance profile.
  • To determine which node is running the agent pod, use:

kubectl get pods -n rafay-system -o wide

Look for the cd-agent-<some hash id> pod and check the NODE column.

  • Ensure the node group's instance role has an IAM policy allowing sts:AssumeRole for the provisioning role ARN.
  • Update the trust relationship of the provisioning role to allow assumption by the node group's instance role ARN.

This setup allows the agent pod to assume the necessary role for EKS provisioning and management operations.
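
As a convenience, the node lookup described above can be scripted. The following is a minimal sketch, assuming kubectl is already pointed at the cluster that runs the agent; the function name agent_node is hypothetical and not part of any Rafay tooling.

```shell
# Print the node that hosts the Rafay agent pod (pod name starts with "cd-agent-").
# Assumes kubectl is configured for the cluster running the agent.
agent_node() {
  kubectl get pods -n rafay-system \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.nodeName}{"\n"}{end}' |
    awk '$1 ~ /^cd-agent-/ { print $2 }'
}

# Usage:
#   agent_node
```

The node name it prints identifies which node group's instance role needs the sts:AssumeRole permission.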

  • When creating an EKS cluster with a private endpoint, the Rafay Agent responsible for provisioning and management must be deployed within the same VPC as the EKS cluster. This ensures the agent has network connectivity to the cluster for all required operations.

AWS Role Configuration

1. Create Required IAM Roles

EC2 Machine Role

Attach this role to the EC2 instance hosting the Rafay agent:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "STSPermissions",
      "Effect": "Allow",
      "Action": [
        "sts:AssumeRole",
        "sts:GetCallerIdentity"
      ],
      "Resource": "*"
    },
    {
      "Sid": "IAMPermissions",
      "Effect": "Allow",
      "Action": [
        "iam:PassRole"
      ],
      "Resource": "*"
    }
  ]
}

Security Best Practice

The above policy uses broad permissions for simplicity. For production environments:

  • Replace Resource: "*" with specific role ARNs
  • Add conditions to restrict which roles can be assumed
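
One way to narrow the policy is sketched below. It is an illustration under assumptions, not a Rafay-provided policy: the function name, the Sid values, and the choice of which roles may be passed are hypothetical. sts:GetCallerIdentity stays on "*" because it does not support resource-level scoping.

```shell
# Print a narrower variant of the agent policy.
# $1: ARN of the role the agent may assume (the provisioning role).
# $2: ARN or ARN pattern of the roles the agent may pass (an assumption;
#     adjust to the roles your clusters actually use).
scoped_agent_policy() {
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AssumeProvisioningRole",
      "Effect": "Allow",
      "Action": "sts:AssumeRole",
      "Resource": "$1"
    },
    {
      "Sid": "CallerIdentity",
      "Effect": "Allow",
      "Action": "sts:GetCallerIdentity",
      "Resource": "*"
    },
    {
      "Sid": "PassScopedRoles",
      "Effect": "Allow",
      "Action": "iam:PassRole",
      "Resource": "$2"
    }
  ]
}
EOF
}

# Usage (role, policy, and file names are placeholders):
#   scoped_agent_policy <provisioning-role-arn> <passable-role-arn> > agent-policy.json
#   aws iam put-role-policy --role-name <agent-role-name> \
#     --policy-name <policy-name> --policy-document file://agent-policy.json
```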

System Template Role

Create a dedicated role with the required EKS Permissions for cluster provisioning.

2. Configure Trust Relationships

Both roles require trust relationships with each other.

System Template Role Trust Policy:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::xxxxxxxxxx:root"
            },
            "Action": "sts:AssumeRole",
            "Condition": {
                "StringEquals": {
                    "sts:ExternalId": "aa4a-6418-ca23-3ece-6c1d"
                }
            }
        },
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::xxxxxxxxxx:role/role-test"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}

EC2 Machine Role Trust Policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com",
        "AWS": "arn:aws:iam::xxxxxxx:role/eks-cluster-provisioning-role"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

Update ARNs

Replace the ARNs in the trust policies with your actual AWS account IDs and role names.
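
To avoid hand-editing the placeholders, the EC2 Machine Role trust policy can be rendered from an account ID and a role name. This is a minimal sketch; the function name and the file name are hypothetical, and the account ID and role name shown in the usage are examples only.

```shell
# Render the EC2 Machine Role trust policy for a given account and role.
# $1: AWS account ID
# $2: name of the provisioning (system template) role
agent_role_trust_policy() {
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com",
        "AWS": "arn:aws:iam::${1}:role/${2}"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

# Usage (role and file names are placeholders):
#   agent_role_trust_policy 111122223333 eks-cluster-provisioning-role > trust.json
#   aws iam update-assume-role-policy --role-name <agent-role-name> \
#     --policy-document file://trust.json
```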

3. Configure EKS Security Groups

The EKS cluster's additional security group must allow inbound access from the Rafay agent:

  • Security Group Type: Additional security group for EKS cluster
  • Inbound Rule: Allow traffic from the agent's network/security group
  • Configuration: Ensure proper communication between Rafay agent and EKS cluster
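
The inbound rule can also be added from the AWS CLI. The sketch below assumes the agent has its own security group and that HTTPS on port 443 (the port the EKS API endpoint is served on) is the only traffic required; the guide itself does not name a port, so treat 443 as an assumption. The function name is hypothetical.

```shell
# Allow the agent's security group to reach the EKS cluster.
# $1: ID of the cluster's additional security group
# $2: ID of the security group attached to the Rafay agent
# Port 443 is an assumption (EKS API endpoint over HTTPS).
allow_agent_ingress() {
  aws ec2 authorize-security-group-ingress \
    --group-id "$1" \
    --protocol tcp \
    --port 443 \
    --source-group "$2"
}

# Usage:
#   allow_agent_ingress <cluster-additional-sg-id> <agent-sg-id>
```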

Step-by-Step Guide

1. Select and Share the AWS EKS System Template

1.1 Create a Project

Create a logically isolated environment for template management:

  1. Navigate to Home > Your Projects
  2. Click Create a New Project
  3. Name it eks-template


1.2 Access and Share the Template

  1. As an Org Admin, go to Settings > Template Catalog
  2. Select Cluster LCM category
  3. Choose the AWS EKS System template
  4. Click Get Started
  5. Provide the following details:
       • Template Name: Unique identifier for your shared template
       • Version: Version name (e.g., v1)
       • Target Project: Select eks-template project
  6. After sharing, you'll be redirected to the selected project


1.3 Configure the Agent

  1. Go to Agents
  2. Configure the required Agent to drive the workflow
  3. Select an existing Agent if already deployed on your AWS private network


1.4 Customize Template Configuration

Configure EKS Parameters

Customize and templatize EKS configurations using input variables:

  • Networking: VPC ID, Subnets, Security Groups
  • Node Groups: Instance types, Node counts, Auto-scaling settings
  • Security: IAM roles, Security groups
  • Monitoring: Logging and monitoring configurations


Set Parameter Restrictions

Control user access to specific variables:

  • Set overrides to Not Allowed for restricted parameters
  • Define default values for consistency
  • Pre-configure up to 45 parameters for streamlined user experience

Configure AWS Credentials

Navigate to Config Context and provide AWS authentication:

  • Option 1: AWS Access Key and Secret
  • Option 2: AWS Role ARN (requires the agent machine to have permission to assume the role)
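
Before entering a Role ARN, it may help to confirm from the agent machine that the role can actually be assumed. The following is a minimal sketch using the AWS CLI; the function name and the session name are hypothetical.

```shell
# Check that the current credentials (for example, the agent's instance role)
# can assume the provisioning role.
# $1: the Role ARN you plan to enter in the Config Context
check_assume_role() {
  aws sts assume-role \
    --role-arn "$1" \
    --role-session-name rafay-assume-check \
    --query 'AssumedRoleUser.Arn' \
    --output text
}

# Usage (run on the machine or pod that hosts the agent):
#   check_assume_role <provisioning-role-arn>
```

An access-denied error here usually points at either the agent role's sts:AssumeRole permission or the provisioning role's trust policy.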


1.5 Save Template Version

  1. Save as Draft for ongoing edits
  2. Set as Active Version when configuration is finalized
  3. Learn more about version management



2. Launch Template to Create EKS Cluster

2.1 Access the Template

  1. Navigate to Environments section in the eks-template project (or shared project)
  2. Locate the shared template in the list


2.2 Launch the Template

  1. Click Launch
  2. Configure the exposed parameters only:
       • Kubernetes Version
       • Blueprint Name and Version
       • Node Group Configuration
       • Other parameters as defined in template
  3. All other EKS configurations are pre-configured with override: Not Allowed


EC2 IMDS Error Troubleshooting

If you encounter an EC2 IMDS error during deployment:

Error: failed to refresh cached credentials, no EC2 IMDS role found,
operation error ec2imds: GetMetadata, http response error StatusCode: 404,
request to EC2 IMDS failed

Solutions:

  • Option 1: Modify instance metadata settings to make IMDSv2 optional
  • Option 2: Increase the IMDSv2 hop limit to 2 or higher if IMDSv2 must remain required
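
Option 2 can be applied with the AWS CLI, as in the sketch below. It assumes you know the ID of the EC2 instance that hosts the agent; the function name is hypothetical. For Option 1, the same command accepts --http-tokens optional instead.

```shell
# Keep IMDSv2 required but raise the PUT response hop limit so that a
# containerized agent (an extra network hop from the instance) can reach IMDS.
# $1: EC2 instance ID
# $2: hop limit to set
set_imds_hop_limit() {
  aws ec2 modify-instance-metadata-options \
    --instance-id "$1" \
    --http-tokens required \
    --http-put-response-hop-limit "$2" \
    --http-endpoint enabled
}

# Usage:
#   set_imds_hop_limit <instance-id> <hop-limit>
```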

References:

  • Retrieving Instance Metadata
  • Configuring IMDS


3. Day 2 Operations

3.1 Kubernetes Upgrades

Control Plane Upgrade

  1. Navigate to the EKS cluster environment
  2. Click Edit
  3. Update Kubernetes version (e.g., from 1.31 to 1.32)
  4. Click Redeploy to initiate upgrade

Node Group Upgrade

  1. Edit the environment
  2. Update the cluster_version for node groups
  3. Click Redeploy to apply changes

Independent Upgrades

Control plane and node groups can be upgraded independently of each other.

3.2 Cluster Deletion

  1. Navigate to the EKS cluster environment
  2. Click Destroy
  3. Confirm by selecting Yes
  4. This will delete the EKS cluster and all dependent resources



Conclusion

You have successfully completed the following:

  • Template Setup: Selected and shared the AWS EKS system template
  • Cluster Management: Performed complete lifecycle management of EKS clusters
  • Day 2 Operations: Learned upgrade and deletion procedures

System templates provide a powerful foundation for:

  • Standardized EKS cluster deployments
  • Compliant organizational governance
  • Flexible workflow integration
  • Efficient team collaboration