Air-Gapped Controller Installation Guide
This guide provides detailed instructions for installing the Rafay Controller in air-gapped environments. Watch the video below for the high-level installation steps.
Components
The image below shows the software components that Rafay's installer automatically installs and configures for the air-gapped controller. The components span four layers:
- Rafay Controller Application Layer
- Observability Layer
- Software Infrastructure Layer
- Kubernetes Cluster Layer
Note
The provisioning and lifecycle management of the underlying VMs or Servers is the responsibility of the operator.
Step-by-Step Installation
sequenceDiagram
participant User
participant Node
participant DNS
participant RafaySupport
User->>Node: 1. Verify system meets prerequisites
User->>DNS: 2. Create internal and external DNS records
User->>RafaySupport: 3. Request and download package tarball
User->>Node: 4. Extract (untar) package
User->>Node: 5. Update config.yaml with required values
User->>Node: 6. Run `radm init --config config.yaml`
Note right of Node: ⚠️ INIT STEP — Initializes Rafay Control Plane<br>Installs Kubernetes layer, storage, and registry
Node-->>User: Output post-init instructions
User->>Node: 7. Verify installation (e.g., check pods)
User->>Node: 8. Run `sudo radm dependency --config config.yaml`
Note right of Node: ⚠️ INSTALLS DEPENDENCIES — Installs components like cert-manager, Kafka, Istio, MinIO, etc.
User->>Node: 9. Run `sudo radm application --config config.yaml`
Note right of Node: ⚠️ DEPLOYS RAFAY SERVICES — Installs controller applications and services
User->>Node: 10. Access Console UI via console.<domain>
User->>Node: 11. Sign up with user/org details
User->>Node: 12. Login to Rafay controller dashboard
User->>Node: 13. Run `sudo radm cluster --config config.yaml`
Note right of Node: ⚠️ PUSH CLUSTER ARTIFACTS — Uploads cluster images and assets to built-in Nexus registry
1. Prerequisites
1.1. Infrastructure Requirements
- Operating System:
  - Ubuntu 24.04
  - RHEL 8
  - RHEL 9
- Instance Requirements:
  - Single Node Controller: 1 node
  - High Availability Controller: 3 master nodes
- System Size (Minimum):
  - 'S': 16 CPU, 64 GB memory (Non-HA)
  - 'M': 32 CPU, 64 GB memory
  - 'L': 64 CPU, 128 GB memory
- Root Disk: Minimum 500 GB
- Temp Directory (/tmp): Minimum 50 GB (if not part of root disk)
- Data Disk: 1 TB (mounted as the /data volume; size varies based on storage requirements)
- RHEL installations need connectivity to default repository servers
- Inbound port 443/tcp must be allowed to all instances
- All localhost ports must be reachable
- Port 30053/UDP must be reachable in non-DNS environments
- SELinux/firewall must be disabled on all nodes
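The sizing minimums above can be checked on a node before installation. The following is a minimal sketch, assuming a Linux host. The helper names (`meets_minimum`, `cpu_count`, `mem_gb`, `root_gb`) are hypothetical and not part of the Rafay tooling. The kernel reports slightly less memory than is physically installed, so treat a near miss as a prompt to check manually rather than a hard failure.

```shell
#!/bin/sh
# Preflight sketch: compare a node's resources with the documented minimums.
# Thresholds come from the list above; all function names are hypothetical.

# meets_minimum ACTUAL REQUIRED LABEL: prints OK/FAIL, returns non-zero on FAIL.
meets_minimum() {
    if [ "$1" -ge "$2" ]; then
        echo "OK   $3: $1 (minimum $2)"
    else
        echo "FAIL $3: $1 (minimum $2)"
        return 1
    fi
}

# Read actual values from the node (Linux only).
cpu_count() { getconf _NPROCESSORS_ONLN; }
mem_gb()    { awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 / 1024 }' /proc/meminfo; }
root_gb()   { df -Pk / | awk 'NR == 2 { printf "%d\n", $2 / 1024 / 1024 }'; }

# Example run for size 'S' (uncomment on the target node):
# meets_minimum "$(cpu_count)" 16 "CPU cores"
# meets_minimum "$(mem_gb)" 64 "memory (GB)"
# meets_minimum "$(root_gb)" 500 "root disk (GB)"
```

For size 'M' or 'L', substitute the CPU and memory values from the list.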
1.2. DNS Configuration
DNS records are required for the controller to function properly. Replace rafay.example.com with your desired domain.
*.rafay.example.com
If wildcard DNS is not available, create these individual records:
api.<rafay.example.com>
console.<rafay.example.com>
fluentd-aggr.<rafay.example.com>
grafana.<rafay.example.com>
kibana.<rafay.example.com>
ops-console.<rafay.example.com>
repo.<rafay.example.com>
*.cdrelay.<rafay.example.com>
*.core-connector.<rafay.example.com>
*.core.<rafay.example.com>
*.connector.infrarelay.<rafay.example.com>
*.user.infrarelay.<rafay.example.com>
*.kubeapi-proxy.<rafay.example.com>
*.user.<rafay.example.com>
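When wildcard DNS is not available, the individual records listed above can be expanded for a domain and checked for resolution. This is a rough sketch; the function names and the `probe` label substituted for each wildcard are assumptions for illustration. It relies on `getent hosts`, which uses the node's configured resolver.

```shell
#!/bin/sh
# Sketch: expand the record list above for a domain and check that each resolves.
# Function names and the "probe" label are hypothetical.

# controller_records DOMAIN: prints one record name per line.
controller_records() {
    domain=$1
    for prefix in api console fluentd-aggr grafana kibana ops-console repo \
        '*.cdrelay' '*.core-connector' '*.core' '*.connector.infrarelay' \
        '*.user.infrarelay' '*.kubeapi-proxy' '*.user'; do
        echo "$prefix.$domain"
    done
}

# Wildcard names cannot be looked up literally, so substitute a sample label.
check_records() {
    controller_records "$1" | sed 's/^\*/probe/' | while read -r name; do
        if getent hosts "$name" >/dev/null 2>&1; then
            echo "resolves: $name"
        else
            echo "MISSING:  $name"
        fi
    done
}

# Usage: check_records rafay.example.com
```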
Note
DNS records should point to the controller nodes' IP addresses. For external SSL offloading, refer to the SSL Offloading section.
1.3. Additional Requirements
- Company logo in PNG format
  - Size: Less than 600 KB
  - Used for white labeling and branding
- Trusted CA-signed wildcard certificate (2048 bit)
  - Required for TLS secure communication
  - Self-signed certificates can be auto-generated for non-prod environments
  - Set generate-self-signed-certs: true in config.yaml for auto-generation
1.4. SSL Offloading Configuration (Optional)
- The Rafay controller supports SSL offload at the load balancer level using ACM/certificates. This requires two load balancers: one for the UI FQDNs, which require SSL offload, and another for the backend FQDNs, which require SSL passthrough.
- To enable external SSL offloading, set the following override-config in config.yaml:
override-config.global.external_lb: true
1.5. DNS Settings for Using External SSL Offload (Optional)
For extended security, all Rafay backend endpoints use mTLS and do not support SSL offloading, except for the frontend UI endpoints.
Frontend FQDNs (Point to Classic Load Balancer for SSL Offloading)
api.<rafay.example.com>
console.<rafay.example.com>
fluentd-aggr.<rafay.example.com>
ops-console.<rafay.example.com>
grafana.<rafay.example.com>
repo.<rafay.example.com>
Backend FQDNs (Point to NLB for mTLS)
registry.<rafay.example.com>
*.core-connector.<rafay.example.com>
*.core.<rafay.example.com>
*.kubeapi-proxy.<rafay.example.com>
*.user.<rafay.example.com>
*.cdrelay.<rafay.example.com>
*.infrarelay.<rafay.example.com>
*.connector.infrarelay.<rafay.example.com>
*.user.infrarelay.<rafay.example.com>
1.6. Load Balancer Setup (Optional)
- Requires two load balancers:
  - Load balancer with a certificate for SSL offloading of UI traffic
  - Load balancer with SSL passthrough for mTLS traffic
- Enable with override-config.global.external_lb: true in config.yaml
Certificate Requirements:
- CA signed wildcard certificate
- Ports: 80/TCP and 443/TCP inbound
- Redirect connections as shown in the tables below
Port Configuration:
Frontend Port | Frontend Protocol | Backend Port | Backend Protocol |
---|---|---|---|
80 | HTTP | 30426 | HTTP |
443 | SECURE TCP(SSL) | 30726 | TCP |
SSL Passthrough Configuration:
Frontend Port | Frontend Protocol | Backend Port | Backend Protocol |
---|---|---|---|
443 | TCP | 30526 | TCP |
Ping Protocol: HTTP
Ping Port: 30326
Ping Path: /healthz/ready
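The ping settings above describe the health check a load balancer runs against each controller node. The same probe can be issued by hand, as in this minimal sketch. The helper names are hypothetical, and the exact status code a healthy node returns is not stated in this guide, so treat any 2xx response as a likely sign of readiness.

```shell
#!/bin/sh
# Sketch: probe the health endpoint a load balancer would poll on each node.
# Port and path come from the ping settings above; helper names are hypothetical.

# health_url NODE: prints the health-check URL for a node IP or hostname.
health_url() {
    echo "http://$1:30326/healthz/ready"
}

# probe_node NODE: prints the HTTP status code (000 if the node is unreachable).
probe_node() {
    curl -s -o /dev/null --max-time 5 -w '%{http_code}' "$(health_url "$1")"
}

# Usage: probe_node <node-ip>
```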
2. Installation Process
2.1. Initial Setup
- Create instances according to specifications in Prerequisites
- Configure DNS entries for controller domains
- Generate wildcard certificates (optional)
2.2. Controller Installation
- Download the air-gapped setup package using the URL provided by the support team.
wget <URL_of_airgap_installation_package>
Info
The air-gapped package is around 30 GB and may take ~15 minutes to download with wget.
For faster downloads, use aria2c, which supports parallel connections:
time aria2c -x 16 <URL_of_airgap_installation_package>
This can significantly reduce download time by using up to 16 connections.
- Validate the package checksum using md5sum to ensure the integrity of the downloaded file. The checksum value will be included in the documentation or shared by the support team for comparison.
md5sum <name-of-downloaded-package>.tar.gz
- Extract the package:
tar -xf <name-of-downloaded-package>.tar.gz
For a detailed breakdown of the files included in this package, refer to the Controller Package Contents.
- Set up configuration:
sudo mv ./radm /usr/bin/
cp -rp config.yaml-airgap-tmpl config.yaml
vi config.yaml
- Configure mandatory fields in config.yaml:
spec:
  deployment:
    ha: true # set to true for HA controller
  repo:
    archive-directory: /path/to/tar/location
    unarchive-path: /tmp # where to untar
  app-config:
    generate-self-signed-certs: true # if using self-signed certificates
    partner:
      star-domain: "*.example.com"
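The checksum and configuration steps above can be scripted so that a mismatch or a forgotten field is caught before initialization. The sketch below assumes GNU `md5sum` and a plain text search of config.yaml. Both helper names are hypothetical. The key check confirms only that each key appears somewhere in the file, not that it is nested correctly.

```shell
#!/bin/sh
# Sketch of two pre-init checks; helper names are hypothetical.

# verify_md5 FILE EXPECTED: succeeds only if FILE's MD5 equals EXPECTED.
verify_md5() {
    # md5sum -c reads "<checksum>  <file>" lines and exits non-zero on mismatch.
    echo "$2  $1" | md5sum -c - >/dev/null 2>&1
}

# missing_keys FILE: prints each mandatory key that is not found in FILE.
# Plain text search, not a YAML parser, so nesting is not validated.
missing_keys() {
    for key in ha archive-directory unarchive-path star-domain; do
        grep -Eq "^[[:space:]]*$key:" "$1" || echo "$key"
    done
}

# Usage:
# verify_md5 <name-of-downloaded-package>.tar.gz <expected-md5> && echo "checksum OK"
# missing_keys config.yaml    # prints nothing when all keys are present
```

The `generate-self-signed-certs` key is left out of the check because it applies only when self-signed certificates are used.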
2.3. Controller Initialization
About radm
radm is a Go-based CLI tool used to manage the full lifecycle of a Rafay air-gapped controller. It handles tasks such as installing infrastructure add-ons, creating the Kubernetes cluster, provisioning software, and ongoing maintenance of the controller, such as config updates and upgrades. Using simple commands, radm takes care of the heavy lifting internally, making complex operations seamless.
sudo radm init --config config.yaml
- Initialize first node:
sudo radm init --config config.yaml
- Join additional control plane nodes:
sudo radm join <master-ip>:6443 --token <token> \
  --discovery-token-ca-cert-hash <hash> \
  --control-plane --certificate-key <key> --config config.yaml
- Join worker nodes:
sudo radm join <master-ip>:6443 --token <token> \
  --discovery-token-ca-cert-hash <hash> --config config.yaml
2.4. Common Setup Steps (Applicable to Both Single Node and HA Setup)
Info
After each radm command completes successfully, the CLI prints instructions to the console indicating the next step in the installation process. This output is self-sufficient, so you can proceed without referring back to the documentation for every step.
- Configure kubeconfig:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) -R $HOME/.kube
- Verify the Kubernetes node and system pod status. All nodes should be Ready and all pods should be in Running status:
kubectl get nodes
kubectl get pods -n kube-system
kubectl get pods -n openebs
- Initialize Dependencies
In this step, all the necessary dependencies for Rafay applications are installed. These dependencies enable various core services and functionalities across the platform. This includes essential infrastructure components such as:
cert-manager, Metrics Server, Kafka, Postgres Operator, Elasticsearch, Istio, HAProxy, ClickHouse, MinIO, and other supporting add-ons required by Rafay services.
Note
This step will take approximately 15 to 16 minutes to complete, as multiple components are being deployed and initialized.
sudo radm dependency --config config.yaml
- Install Rafay application:
Once the dependencies are initialized, proceed to install the Rafay platform services.
sudo radm application --config config.yaml
Installation Time
Allow 20 minutes for all pods to become ready. You can monitor pod status in the rafay-core namespace:
kubectl get pods -n rafay-core
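Rather than rerunning the command by hand, the pod listing can be filtered for pods that are not yet ready. This is a minimal sketch. `pending_pods` is a hypothetical helper that reads `kubectl get pods --no-headers` output, where column 2 is READY and column 3 is STATUS, and it assumes that pods in Completed status are finished jobs that can be ignored.

```shell
#!/bin/sh
# Sketch: list pods that are not yet ready.
# Reads `kubectl get pods --no-headers` output on stdin
# (columns: NAME READY STATUS RESTARTS AGE). pending_pods is a hypothetical name.

pending_pods() {
    awk '{ split($2, r, "/") }
         $3 == "Completed" { next }                      # finished jobs
         $3 != "Running" || r[1] != r[2] { print $1 " (" $3 " " $2 ")" }'
}

# Usage: poll every 30 seconds until nothing is pending.
# while [ -n "$(kubectl get pods -n rafay-core --no-headers | pending_pods)" ]; do
#     sleep 30
# done
```

The same filter works for the kube-system and openebs namespaces checked earlier.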
3. Accessing the Controller
- Access the UI at: https://console.<your-domain>
- You can create the first organization in one of two ways:
  - Click "Sign Up" on the main console (https://console.<your-domain>)
  - Or use the Operations Console at https://ops-console.<your-domain> using the super-user credentials set in config.yaml to create the organization and user.
- When creating the organization, provide the following details:
- Organization Name
- Username / Email
- Password
- After creating the organization and user, log in using the newly created credentials.
4. Additional Configuration
Info
If you plan to create or manage downstream clusters (EKS, MKS, GKE & Import) from this controller, don't forget to run the Cluster Dependencies step below.
4.1. Cluster Dependencies
Use the radm command below to push the required images, packs, and manifests to the built-in Nexus registry. These images and manifests are used when creating or managing clusters with this air-gapped controller.
sudo radm cluster --config config.yaml
4.2. Multiple Interface Support (Optional)
Rafay Controller supports multiple interfaces, configurable via config.yaml. By default, the primary interface is used for all Kubernetes and Rafay app connections.
Configure network interface in config.yaml:
spec:
  networking:
    interface: ens3
For complete interface isolation, add routing rules:
ip route add 10.96.0.0/12 dev <secondary-interface>
ip route add 10.224.0.0/16 dev <secondary-interface>
4.3. Cost Visibility (Optional)
Rafay Controller supports integrated cost visibility. For self-hosted setups, an external InfluxDB is required. Use the provided steps to deploy it on a single-node instance (min: 16 CPU, 32GB RAM, 200GB disk) and connect it.
Cost metrics are controlled by the following setting in config.yaml (set enabled to true to turn them on):
cost_metrics:
  enabled: false
Note
Requires pre-installed external InfluxDB with minimum 16 CPU, 32GB memory & 200GB disk.