What is it?¶
The Azure AKS Template is a pre-built system template for Azure Kubernetes Service (AKS) lifecycle management, covering both day-0 and day-2 operations. This template is part of the Template Catalog under the Kubernetes Lifecycle Management section and enables organizations to create self-service workflows for end users without requiring extensive configuration knowledge.
This template provides comprehensive AKS management capabilities and is fully supported, with regular updates and new features added over time. With this template, administrators can follow two simple steps to provide a self-service experience for their end users:
- Configure and customize the system template (provide credentials, specify defaults, and determine what values end users can/cannot override) in a project owned by the Platform team
- Publish by sharing the template with end user projects
Prerequisites¶
Before consuming the Azure AKS Template, ensure you have the following prerequisites in place:
1. Healthy GitOps Agent¶
- Deploy a healthy GitOps agent that drives the workflow
- The agent can be deployed as:
- Docker container
- Kubernetes deployment
- The agent's network must have reachability to the network where AKS clusters will be created
- Refer to the GitOps Agent setup documentation for detailed configuration
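As a quick sanity check, you can verify outbound reachability from the network where the agent runs. The endpoints below are examples only: the Azure Resource Manager endpoint is standard, while the Rafay controller URL depends on your environment and is shown as a placeholder.

```bash
# Run from the host or namespace where the GitOps agent is (or will be) deployed.
curl -sSI https://management.azure.com | head -n 1   # Azure Resource Manager (any HTTP response indicates reachability)
curl -sSI https://console.rafay.dev | head -n 1      # Rafay controller endpoint (placeholder; use your own controller URL)
```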
2. Valid Rafay API Key¶
- Obtain a valid Rafay API key for authentication
- The API key should have appropriate permissions for AKS template operations
- Refer to the API Key management documentation for setup instructions
3. Azure Service Principal Credentials¶
- Configure valid Azure Service Principal credentials with permissions for:
- Authentication
- AKS lifecycle management operations
- Alternative: You can use a User Managed Identity for the resources in the cluster configuration
- Refer to the AKS credentials documentation for detailed setup instructions
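For reference, a Service Principal can be created with the Azure CLI as shown below. The role and scope here are illustrative assumptions; grant only the permissions your organization requires for AKS lifecycle management.

```bash
# Creates a Service Principal; the output's appId, password, and tenant map to the
# client ID, client secret, and tenant ID used in the template's Azure credentials.
az ad sp create-for-rbac \
  --name "rafay-aks-template-sp" \
  --role "Contributor" \
  --scopes "/subscriptions/<subscription-id>"
```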
Configuration¶
The AKS System Template includes the following configuration sections:
1. Agent Configuration¶
- GitOps Agent or Agent Pools can be configured at the template level or added at runtime during environment deployment
- Drives workflow execution.
2. Backend Store Type Configuration¶
- State Store configuration for managing infrastructure state
- Supports System (Rafay-managed), S3, or TFC (Terraform Cloud) backend stores
- Set to empty by default - use System store for quick deployment
3. Rafay-Specific Configuration¶
- Blueprint specification for the cluster configuration
- Project name where the AKS cluster will be created
- Defines the Rafay platform configuration
4. Azure AKS Configuration¶
- Azure-specific settings for AKS cluster creation and management
- Includes region, node pools, networking, and other AKS-specific parameters
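The template collects and applies these settings for you. Purely for context, the raw Azure CLI equivalent of a minimal cluster with a small node pool looks roughly like the sketch below; all names, versions, and sizes are placeholders.

```bash
# Illustrative only — the template drives cluster creation; you do not run this yourself.
az aks create \
  --resource-group "<resource-group>" \
  --name "<cluster-name>" \
  --location eastus \
  --kubernetes-version 1.29.0 \
  --node-count 2 \
  --node-vm-size Standard_DS2_v2 \
  --network-plugin azure
```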
5. Credentials¶
- Rafay API Key for platform authentication
- Azure Credentials (Service Principal or User Managed Identity)
- Can be configured at the template level or applied at runtime during environment deployment
Workflow Overview¶
The Azure AKS Template follows a centralized configuration model where platform administrators first configure and customize the template in a central project, then share it with end-user projects for consumption.
graph TD
A[Template Catalog] --> B[Platform Admin: Get Started]
B --> C[Share to Central Project]
C --> D[Configure Template]
D --> E[Customize Input Variables]
E --> F[Set Schedules Optional]
F --> G[Share to End User Projects]
G --> H[End Users Deploy AKS Clusters]
Step-by-Step Guide¶
WARNING
This guide provides general guidance and example configurations only. It may not meet your specific environment requirements or cover all possible configurations. Tailor these configuration steps to your needs.
Step 1: Locate and Initialize the Azure AKS Template¶
- Navigate to the Template Catalog from the home page
- Under Kubernetes Lifecycle Management, locate the Azure AKS card
- Click the Get Started button
- Provide the following details:
- Template name for your organization
- Version identifier
- Central project where you'll configure the template before sharing
Step 2: Configure the Template¶
Once the Azure AKS template is shared to your central project, configure the essential components:
2.1 Add GitOps Agent¶
- Configure the GitOps agent at the template level
- This agent will drive the workflow execution for the deployment.
2.2 Configure Backend Store Type¶
State Store Change Restriction
Important: Once you have selected a state store type (S3, TFC, or System), you cannot migrate to a different state store type. This change is not supported.
Configure the backend store type for state management. This setting is empty by default and supports the following state store configurations:
Supported State Store Types:
- System (Recommended for quick deployment)
    - State is managed and stored in Rafay's state store
    - No need to bring your own state store
    - Ideal for getting started quickly without external infrastructure
- S3
    - Use Amazon S3 as the state store
    - Provide access credentials (Access ID and Secret) to interact with the S3 endpoint
    - Alternatively, use a role-based ARN, assuming the agent driving the workflow has a role that grants access to the S3 service
- TFC (Terraform Cloud)
    - Use Terraform Cloud (TFC) as the state store
    - Provide TFC-related configuration including organization, workspace, and authentication details
Quick Start
If you're deploying for the first time or testing, select system backend store type. This uses Rafay's managed state store and requires no additional configuration, allowing you to start deploying immediately.
Tooltip for Backend Store Configuration Field
The Backend Store Configuration field displays a tooltip that shows only the configuration fields relevant to your selected backend store type (S3 or TFC). This ensures you only see and configure the necessary settings for your chosen backend store.
Note that some configuration parameters, such as the S3 key, are automatically generated if not explicitly provided, simplifying the configuration process.
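If you plan to use S3 as the backend store, the bucket must already exist and be reachable by the agent. Below is a minimal sketch of preparing such a bucket with the AWS CLI; the bucket name and region are placeholders, and versioning is enabled as a common safeguard for state files.

```bash
# Create a bucket for state storage and turn on versioning (placeholder names).
aws s3api create-bucket --bucket my-aks-template-state --region us-east-1
aws s3api put-bucket-versioning \
  --bucket my-aks-template-state \
  --versioning-configuration Status=Enabled
```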
Day 2 Configuration Updates
Important: Day 2 updates are not supported for state store configurations that modify references to existing state files (e.g., the S3 bucket name). Other configurations, such as the S3 role/secret or TFC token, can be updated.
Configuration Approaches:
Option 1: Platform Admin Pre-Configuration (Recommended for Simplified User Experience)
- Update the `state-store-config-context` with the desired backend store configuration
- Set default values for the Backend Store Type and Backend Store Configuration input variables
- End users will not need to deal with state store configuration
- Ensures consistent state management across deployments
Option 2: End User Selection
- Allow end users to select their preferred state store during deployment
- End users can choose system for quick deployment without existing infrastructure
- End users with existing state stores (S3 or TFC) can configure accordingly based on their needs
- Provides flexibility for different use cases and environments
Quick Start Recommendation
For quick deployment to test the template, use system backend store type if you don't have any existing state store configured yet. This allows you to use Rafay's managed state store without any additional setup.
Platform Admin Best Practice
Pre-configure the backend store type at the template level using the `state-store-config-context` to simplify the deployment experience for end users while maintaining centralized state management control.
2.3 Set Up Configuration Context¶
- Configure the `aks-rafay-env-vars` context with the following (see the illustrative sketch after this list):
    - Azure credentials (Service Principal or User Managed Identity)
    - Rafay API key for authentication
- Lock the credentials to prevent end users from modifying them
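As a rough sketch of the kind of values this context carries, the snippet below uses the standard Terraform AzureRM provider environment variable names (ARM_*) plus a placeholder variable for the Rafay API key. The exact variable names expected by the template may differ, so treat every name here as an assumption and confirm against the context definition in your template.

```bash
# Assumed variable names — verify against the aks-rafay-env-vars context in the template.
export ARM_CLIENT_ID="<service-principal-app-id>"
export ARM_CLIENT_SECRET="<service-principal-secret>"
export ARM_TENANT_ID="<azure-tenant-id>"
export ARM_SUBSCRIPTION_ID="<azure-subscription-id>"
export RAFAY_API_KEY="<rafay-api-key>"   # hypothetical variable name for the Rafay API key
```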
2.4 Lock Down Credentials¶
The screenshot shows locking a single variable, but you can apply the same approach to the other credential variables. Set them as non-overrideable so end users cannot see or modify them; credentials are then handled implicitly on their behalf.
Step 3: Customize Input Variables¶
Platform administrators can customize which variables to expose to end users:
3.1 Set Default Values¶
- Blueprint name and version for cluster configuration
- Region for AKS cluster deployment
- Kubernetes version for the cluster
- Cluster tags for resource organization
3.2 Restrict User Inputs¶
- Location restrictions (e.g., only allow specific Azure regions)
- Blueprint restrictions (e.g., only allow approved blueprints)
- Resource limits (e.g., maximum node count)
3.3 Customize Input Variables¶
Step 4: Configure Schedules (Optional)¶
Set up automated schedules for cluster lifecycle management:
- Destroy schedule (e.g., destroy clusters at end of business day)
- Deploy schedule (e.g., recreate clusters in the morning)
- Maintenance windows for updates
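As an illustration, assuming the schedules accept standard cron expressions (this is an assumption; check the schedule configuration for the exact format), an end-of-day destroy with a morning recreate could look like:

```bash
# Illustrative cron expressions (assumption: standard 5-field cron syntax)
#   Destroy clusters at 7:00 PM, Monday through Friday:  0 19 * * 1-5
#   Deploy clusters at 7:00 AM, Monday through Friday:   0 7 * * 1-5
```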
Step 5: Share with End User Projects¶
Once configuration is complete, save it as an active version and share the template with end-user projects:

1. Navigate to the template sharing settings
2. Select target end-user projects
3. Publish the template for consumption
Step 6: Enable Approval Hooks (Optional)¶
When you share the template with a central project, approval hooks are disabled by default. If you want to review the plan before applying infrastructure changes, you can enable approval hooks.
There are two types of approval hooks available:
- Apply Before: Requires approval before applying infrastructure changes
- Destroy Before: Requires approval before destroying infrastructure resources
When to Use Approval Hooks
Approval hooks are useful when you need to review the plan before the IaC deployment is applied. This provides an extra layer of control and validation before infrastructure changes are executed.
To enable approval hooks:
- Navigate to Resources > res-aks-cluster resource
- Click Edit to modify the resource
- Add override values for the approval hooks as shown in the GIF below:
    - For apply approval: set the `approval-hook-apply-before` override value
    - For destroy approval: set the `approval-hook-destroy-before` override value
Here is the value you need to input to enable an approval hook:
[{"name":"approval","options":{"approval":{"type":"internal"}},"type":"approval"}]
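The value is a JSON array describing a single internal approval hook. If you want to inspect its structure, you can pretty-print the exact same value locally, for example:

```bash
# Pretty-print the hook value to see its structure (same JSON as above, just indented).
echo '[{"name":"approval","options":{"approval":{"type":"internal"}},"type":"approval"}]' \
  | python3 -m json.tool
```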
Configuration Flexibility¶
This workflow provides flexibility for different organizational needs:
- Fully Managed: Platform admin configures all settings, end users simply deploy
- Hybrid Approach: Some settings pre-configured, others left for end users
- User-Driven: Minimal pre-configuration, maximum end-user control
The recommended approach is the fully managed configuration, which reduces the burden on end users while maintaining security and compliance standards.
End User Flow¶
Once the platform administrator shares the Azure AKS template to end-user projects, end users can easily deploy AKS clusters with minimal configuration effort.
Step 1: Access the Shared Template¶
- Navigate to your project where the Azure AKS template has been shared
- Locate the Azure AKS Template in your available templates
- Click Launch to begin the deployment process
Step 2: Configure Template Inputs¶
Based on the configuration exposed by the platform administrator, provide the necessary inputs:
2.1 Required Configuration¶
- Cluster name for your AKS deployment
- Resource group (if not pre-configured)
- Region (if multiple regions are allowed)
- Node pool configuration (if customizable)
2.2 Optional Configuration¶
- Cluster tags for resource organization
- Network configuration (if exposed by admin)
- Additional labels or annotations
2.3 State Store Configuration (if exposed by platform admin)¶
State Store Change Restriction
Important: Once you have selected a state store type (S3, TFC, or System), you cannot migrate to a different state store type. This change is not supported.
Day 2 Configuration Updates
Important: Day 2 updates are not supported for state store configurations that modify references to existing state files (e.g., the S3 bucket name). Other configurations, such as the S3 role/secret or TFC token, can be updated.
Configure the state store for managing deployment state:
- Backend Store Type: Select from the available options:
    - System: Use Rafay's managed state store (recommended for quick deployment)
    - S3: Use Amazon S3 as the state store
    - TFC: Use Terraform Cloud as the state store
- Backend Store Configuration: Provide the necessary details based on your selected store type:
    - For System: No additional configuration required
    - For S3: Provide the bucket name, region, and access credentials or role ARN
    - For TFC: Provide the organization, workspace, and authentication details
Quick Start
If you're deploying for the first time or testing, select system backend store type. This uses Rafay's managed state store and requires no additional configuration, allowing you to start deploying immediately.
Tooltip for Backend Store Configuration Field
The Backend Store Configuration field displays a tooltip that shows only the configuration fields relevant to your selected backend store type (S3 or TFC). This ensures you only see and configure the necessary settings for your chosen backend store.
Step 3: Deploy or Save Configuration¶
After providing all required inputs, you have two options:
Option 1: Save and Continue Later¶
- Click Save to store your configuration
- Return later to complete the deployment
Option 2: Save and Deploy¶
- Click Save & Deploy to immediately start the deployment process
- The AKS cluster creation will begin automatically
Step 4: Monitor Deployment Progress¶
Track the deployment progress through the status indicators. The screenshot below shows how to monitor your deployment status.
Step 5: Access Cluster Resources¶
Once the deployment status shows Success, the following outputs are available:
5.1 Cluster Access Information¶
- Kubeconfig file for cluster access
5.2 Resource Information¶
- Resource group where cluster was created
- Node pool details and status
Post-Deployment Information
Once the deployment is finished and shows success, the cluster will be visible under the Infrastructure tab for monitoring and dashboard purposes. All day-2 operations are supported using template edit functionality - you can change values and redeploy as needed.
Step 6: Verify Cluster Access¶
After successful deployment, download the ZTKA KUBECONFIG from Infrastructure > Clusters to interact with the cluster. You can now deploy workloads on this successfully provisioned cluster.
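For example, once the kubeconfig is downloaded, a couple of standard kubectl commands confirm access (the file path below is a placeholder):

```bash
# Point kubectl at the downloaded ZTKA kubeconfig (placeholder path) and verify access.
export KUBECONFIG="$HOME/Downloads/<cluster-name>-kubeconfig.yaml"
kubectl get nodes -o wide     # node pool VMs should report Ready
kubectl get pods -A           # system and blueprint add-on pods
```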
Benefits for End Users¶
- Simplified Deployment: Pre-configured templates reduce complexity
- Consistent Configuration: Standardized settings across all deployments
- Security: Credentials managed by platform administrators
- Compliance: Built-in governance and policy enforcement
- Self-Service: Deploy clusters without waiting for platform team assistance