
Air-Gapped Controller Installation Guide

This guide provides detailed instructions for installing the Rafay Controller in air-gapped environments.


Components

The image below describes the various software components that are automatically installed and configured by Rafay's installer for the air-gapped controller. The components span four layers:

  • Rafay Controller Application Layer
  • Observability Layer
  • Software Infrastructure Layer
  • Kubernetes Cluster Layer

Components of Air Gapped Controller

Note

The provisioning and lifecycle management of the underlying VMs or Servers is the responsibility of the operator.


Step-by-Step Installation

sequenceDiagram
    participant User
    participant Node
    participant DNS
    participant RafaySupport

    User->>Node: 1. Verify system meets prerequisites
    User->>DNS: 2. Create internal and external DNS records
    User->>RafaySupport: 3. Request and download package tarball
    User->>Node: 4. Extract (untar) package
    User->>Node: 5. Update config.yaml with required values

    User->>Node: 6. Run `radm init --config config.yaml`
    Note right of Node: ⚠️ INIT STEP — Initializes Rafay Control Plane<br>Installs Kubernetes layer, storage, and registry
    Node-->>User: Output post-init instructions

    User->>Node: 7. Verify installation (e.g., check pods)

    User->>Node: 8. Run `sudo radm dependency --config config.yaml`
    Note right of Node: ⚠️ INSTALLS DEPENDENCIES — Installs components like cert-manager, Kafka, Istio, MinIO, etc.

    User->>Node: 9. Run `sudo radm application --config config.yaml`
    Note right of Node: ⚠️ DEPLOYS RAFAY SERVICES — Installs controller applications and services

    User->>Node: 10. Access Console UI via console.<domain>
    User->>Node: 11. Sign up with user/org details
    User->>Node: 12. Login to Rafay controller dashboard

    User->>Node: 13. Run `sudo radm cluster --config config.yaml`
    Note right of Node: ⚠️ PUSH CLUSTER ARTIFACTS — Uploads cluster images and assets to built-in Nexus registry

1. Prerequisites

1.1. Infrastructure Requirements

  • Operating System:

    • Ubuntu 24.04
    • RHEL 8
    • RHEL 9
  • Instance Requirements:

    • Single Node Controller: 1 node
    • High Availability Controller: 3 master nodes
  • System Size (Minimum):

    • 'S': 16 CPU, 64GB memory (Non-HA)
    • 'M': 32 CPU, 64GB memory
    • 'L': 64 CPU, 128GB memory
  • Root Disk: Minimum 500 GB
  • Temp Directory (/tmp): Minimum 50GB (if not part of root disk)
  • Data Disk: 1 TB (mounted as /data volume, size varies based on storage requirements)
  • RHEL installations need connectivity to default repository servers
  • Inbound port 443/tcp must be allowed to all instances
  • All localhost ports must be reachable
  • Port 30053/UDP must be reachable in non-DNS environments
  • SELinux/firewall must be disabled on all nodes
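Several of these host-level prerequisites can be applied up front. The commands below are a minimal sketch for RHEL hosts (Ubuntu uses ufw rather than firewalld); run them on every node before installation:

```shell
# Disable SELinux immediately and across reboots (RHEL).
sudo setenforce 0 || true
sudo sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config

# Disable the firewall (use "sudo ufw disable" on Ubuntu instead).
sudo systemctl disable --now firewalld
```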

1.2. DNS Configuration

DNS records are required for the controller to function properly. Replace rafay.example.com with your desired domain.

*.rafay.example.com

If wildcard DNS is not available, create these individual records:

api.<rafay.example.com>
console.<rafay.example.com>
fluentd-aggr.<rafay.example.com>
grafana.<rafay.example.com>
kibana.<rafay.example.com>
ops-console.<rafay.example.com>
repo.<rafay.example.com>
*.cdrelay.<rafay.example.com>
*.core-connector.<rafay.example.com>
*.core.<rafay.example.com>
*.connector.infrarelay.<rafay.example.com>
*.user.infrarelay.<rafay.example.com>
*.kubeapi-proxy.<rafay.example.com>
*.user.<rafay.example.com>
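Before proceeding, it is worth confirming that the records resolve from the controller nodes. A minimal sketch, assuming rafay.example.com is your domain and getent is available on the host:

```shell
DOMAIN="rafay.example.com"   # replace with your controller domain
for fqdn in api.$DOMAIN console.$DOMAIN ops-console.$DOMAIN repo.$DOMAIN grafana.$DOMAIN; do
  if getent hosts "$fqdn" >/dev/null; then
    echo "OK: $fqdn resolves"
  else
    echo "MISSING: $fqdn does not resolve" >&2
  fi
done
```

Wildcard entries such as *.core.<rafay.example.com> can be spot-checked the same way with any hostname under the wildcard.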

Note

DNS records should point to the controller nodes' IP addresses. For external SSL offloading, refer to the SSL Offloading section.


1.3. Additional Requirements

  • Company logo in PNG format:

    • Size: less than 600 KB
    • Used for white labeling and branding
  • Trusted CA signed wildcard certificate (2048-bit):

    • Required for TLS secure communication
    • Self-signed certificates can be auto-generated for non-prod environments
    • Set generate-self-signed-certs: true in config.yaml for auto-generation
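If you prefer to generate a non-prod wildcard certificate yourself instead of letting the installer do it, a self-signed 2048-bit certificate can be created with openssl. A sketch (requires OpenSSL 1.1.1+ for -addext; the domain is a placeholder):

```shell
# Non-prod only: self-signed wildcard certificate and key.
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
  -keyout wildcard.key -out wildcard.crt \
  -subj "/CN=*.rafay.example.com" \
  -addext "subjectAltName=DNS:*.rafay.example.com,DNS:rafay.example.com"
```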

1.4. SSL Offloading Configuration (Optional)

  • The Rafay controller supports SSL offload at the load balancer level using ACM or other certificates. This requires two load balancers: one for the UI FQDNs, which require SSL offload, and another for the backend FQDNs, which require SSL passthrough.

  • To enable external SSL offloading, set the following override in config.yaml:

    override-config.global.external_lb: true


1.5. DNS Settings for Using External SSL Offload (Optional)

For extended security, all Rafay backend endpoints use mTLS and do not support SSL offloading; only the frontend UI endpoints support it.

Frontend FQDNs (Point to Classic Load Balancer for SSL Offloading)

  • api.<rafay.example.com>
  • console.<rafay.example.com>
  • fluentd-aggr.<rafay.example.com>
  • ops-console.<rafay.example.com>
  • grafana.<rafay.example.com>
  • repo.<rafay.example.com>

Backend FQDNs (Point to NLB for mTLS)

  • registry.<rafay.example.com>
  • *.core-connector.<rafay.example.com>
  • *.core.<rafay.example.com>
  • *.kubeapi-proxy.<rafay.example.com>
  • *.user.<rafay.example.com>
  • *.cdrelay.<rafay.example.com>
  • *.infrarelay.<rafay.example.com>
  • *.connector.infrarelay.<rafay.example.com>
  • *.user.infrarelay.<rafay.example.com>

1.6. Load Balancer Setup (Optional)

  • Requires two load balancers:
    1. Load balancer with a certificate, for SSL offloading of UI traffic
    2. Load balancer with SSL passthrough, for mTLS traffic
  • Enable with: override-config.global.external_lb: true in config.yaml

Certificate Requirements:

  • CA signed wildcard certificate
  • Ports: 80/TCP and 443/TCP inbound
  • Redirect connections as per the tables below

Port Configuration:

Frontend Port   Frontend Protocol   Backend Port   Backend Protocol
80              HTTP                30426          HTTP
443             SECURE TCP (SSL)    30726          TCP

SSL Passthrough Configuration:

Frontend Port   Frontend Protocol   Backend Port   Backend Protocol
443             TCP                 30526          TCP

Health check settings:

  • Ping Protocol: HTTP
  • Ping Port: 30326
  • Ping Path: /healthz/ready
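Before wiring up the load balancer target groups, you can confirm the listed NodePorts are reachable from the load balancer host. A simple probe using bash's /dev/tcp (a sketch; the node IP below is a placeholder):

```shell
NODE_IP="203.0.113.10"   # placeholder: replace with a controller node IP
for port in 30426 30726 30526 30326; do
  if timeout 3 bash -c "exec 3<>/dev/tcp/$NODE_IP/$port" 2>/dev/null; then
    echo "OK: port $port reachable"
  else
    echo "FAIL: port $port not reachable" >&2
  fi
done
```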

2. Installation Process

2.1. Initial Setup

  1. Create instances according to specifications in Prerequisites
  2. Configure DNS entries for controller domains
  3. Generate wildcard certificates (optional)

2.2. Controller Installation

  1. Download the air-gapped setup package using the URL provided by the support team.

    wget <URL_of_airgap_installation_package>
    

    Info

    The air-gapped package is around 30 GB and may take ~15 minutes to download with wget.
    For faster downloads, use aria2c, which supports parallel connections:

    time aria2c -x 16 <URL_of_airgap_installation_package>
    
    This can significantly reduce download time by using up to 16 connections.

  2. Validate the package checksum using md5sum to ensure the integrity of the downloaded file. The checksum value will be included in the documentation or shared by the support team for comparison.

    md5sum <name-of-downloaded-package>.tar.gz
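    To compare against the value shared by the support team non-interactively, md5sum -c can be used. A sketch (the expected checksum and file name below are placeholders):

```shell
# Placeholder: substitute the checksum provided by the support team.
EXPECTED_MD5="<checksum-from-support-team>"
echo "$EXPECTED_MD5  <name-of-downloaded-package>.tar.gz" | md5sum -c -
```

    Note the two spaces between the checksum and the file name, which md5sum -c requires.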
    

  3. Extract the package:

    tar -xf <name-of-downloaded-package>.tar.gz
    
    For a detailed breakdown of the files included in this package, refer to the Controller Package Contents.

  4. Set up configuration:

    sudo mv ./radm /usr/bin/
    cp -rp config.yaml-airgap-tmpl config.yaml
    vi config.yaml
    

  5. Configure mandatory fields in config.yaml:

spec:
  deployment:
    ha: true  # set to true for HA controller
  repo:
    archive-directory: /path/to/tar/location
    unarchive-path: /tmp # where to untar
  app-config:
    generate-self-signed-certs: true  # if using self-signed certificates
    partner:
      star-domain: "*.example.com"
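As a quick sanity check before running the installer, you can grep config.yaml for the mandatory keys. A rough sketch; it only checks that the keys are present, not that their values are valid:

```shell
for key in "ha:" "archive-directory:" "unarchive-path:" "star-domain:"; do
  if grep -q "$key" config.yaml; then
    echo "found: $key"
  else
    echo "MISSING: $key" >&2
  fi
done
```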

2.3. Controller Initialization

About radm

radm is a Go-based CLI tool used to manage the full lifecycle of a Rafay air-gapped controller. It handles tasks such as installing infrastructure add-ons, creating the Kubernetes cluster, provisioning software, and performing ongoing maintenance of the controller such as configuration updates and upgrades. Using simple commands, radm takes care of all the heavy lifting internally, making complex operations seamless.

  1. Initialize first node:

    sudo radm init --config config.yaml
    

  2. Join additional control plane nodes:

    sudo radm join <master-ip>:6443 --token <token> \
      --discovery-token-ca-cert-hash <hash> \
      --control-plane --certificate-key <key> --config config.yaml
    

  3. Join worker nodes:

    sudo radm join <master-ip>:6443 --token <token> \
      --discovery-token-ca-cert-hash <hash> --config config.yaml
    


2.4. Common Setup Steps (Applicable to Both Single Node and HA Setup)

Info

After each radm command is successfully executed, the CLI will print clear instructions to the console indicating the next steps in the installation process. This output is self-sufficient and acts as a guide, helping you proceed confidently without needing to refer back to the documentation for every step.

  1. Configure kubeconfig:

    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) -R $HOME/.kube
    

  2. Verify that the Kubernetes nodes and all system pods are in Running status:

    kubectl get nodes
    kubectl get pods -n kube-system
    kubectl get pods -n openebs
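    The per-pod output above can be reduced to a single not-ready count per namespace, which is convenient for scripting. A sketch (treats Completed pods as done):

```shell
for ns in kube-system openebs; do
  pending=$(kubectl get pods -n "$ns" --no-headers \
    | awk '$3 != "Running" && $3 != "Completed"' | wc -l)
  echo "$ns: $pending pod(s) not ready"
done
```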
    

  3. Initialize Dependencies

    In this step, all the necessary dependencies for Rafay applications are installed. These dependencies enable various core services and functionalities across the platform. This includes essential infrastructure components such as:

    cert-manager, Metric Server, Kafka, Postgres Operator, Elasticsearch, Istio, HAProxy, ClickHouse, MinIO, and other supporting add-ons required by Rafay services.

    Note

    This step will take approximately 15 to 16 minutes to complete, as multiple components are being deployed and initialized.

    sudo radm dependency --config config.yaml
    
  4. Install Rafay application:

    Once the dependencies are initialized, proceed to install the Rafay platform services.

    sudo radm application --config config.yaml
    

Installation Time

Allow 20 minutes for all pods to become ready. You can monitor pod status in the rafay-core namespace:

kubectl get pods -n rafay-core


3. Accessing the Controller

  1. Access the UI at: https://console.<your-domain>
  2. You can create the first organization in one of two ways:
    • Click "Sign Up" on the main console (https://console.<your-domain>)
    • Or use the Operations Console at https://ops-console.<your-domain> using the super-user credentials set in config.yaml to create the organization and user.
  3. When creating the organization, provide the following details:
    • Organization Name
    • Username / Email
    • Password
  4. After creating the organization and user, log in using the newly created credentials.

4. Additional Configuration

Info

If you plan to create or manage downstream clusters (EKS, MKS, GKE & Import) from this controller, don’t forget to run the below Cluster Dependencies Step.


4.1. Cluster Dependencies

Upload cluster images and manifests to the built-in Nexus registry using the radm command below. This pushes the required images, packs, and manifests, which are used when creating or managing clusters with this air-gapped controller.

sudo radm cluster --config config.yaml

4.2. Multiple Interface Support (Optional)

Rafay Controller supports multiple interfaces, configurable via config.yaml. By default, the primary interface is used for all Kubernetes and Rafay app connections.

Configure network interface in config.yaml:

spec:
  networking:
    interface: ens3

For complete interface isolation, add routing rules:

ip route add 10.96.0.0/12 dev <secondary-interface>
ip route add 10.224.0.0/16 dev <secondary-interface>


4.3. Cost Visibility (Optional)

Rafay Controller supports integrated cost visibility. For self-hosted setups, an external InfluxDB is required. Use the provided steps to deploy it on a single-node instance (min: 16 CPU, 32GB RAM, 200GB disk) and connect it.

Enable cost metrics in config.yaml:

cost_metrics:
  enabled: true

Note

Requires pre-installed external InfluxDB with minimum 16 CPU, 32GB memory & 200GB disk.