Supported Environments
Please review the information listed below to understand the supported environments and operational requirements.
Operating Systems¶
The following operating systems are supported:
Operating System | Master (Control Plane) | Worker Nodes |
---|---|---|
Almalinux 9 (64-bit) | YES | YES |
RHEL 7.x (64-bit) | YES | YES |
RHEL 8.x (64-bit) | YES | YES |
RHEL 9.1 & 9.2 (64-bit) | YES | YES |
Rocky Linux 9 (64-bit) | YES | YES |
Ubuntu 20.04 LTS (64-bit) | YES | YES |
Ubuntu 22.04 LTS (64-bit) | YES | YES |
Ubuntu 24.04 LTS (64-bit) | YES | YES |
Windows Server 2019 (64-bit) | NO | YES |
Windows Server 2022 (64-bit) | NO | YES |
Explore our blog for deeper insights on Upstream Kubernetes on Rocky Linux 9 OS, available here!
Important
- Windows worker nodes require Kubernetes v1.23.x or higher and the Calico CNI
- Clusters with Windows worker nodes cannot be provisioned when using the Canal or Cilium CNI
Hypervisors¶
For VM based deployments, Rafay's MKS distribution based on Upstream Kubernetes is agnostic of the underlying hypervisor. For data center environments, cluster provisioning and ongoing lifecycle management have been validated with the following hypervisors and HCI.
- VMware vSphere (v7.x and v8.x)
- Microsoft Hyper-V
- Nutanix AOS (v6.5.x LTS, v6.8.x)
- OpenStack (2023.1 Antelope, 2023.2 Bobcat, 2024.1 Caracal)
- VirtualBox (v7.0.x, v6.1.x)
Kubernetes Versions¶
The following Kubernetes versions are currently supported; new clusters can be provisioned using any of these versions.
- Four versions of Kubernetes are supported at any given time.
- Once a new version of Kubernetes is added, support for the oldest version is removed.
- Customers are strongly encouraged to upgrade their clusters to a supported version to continue receiving patches and security updates.
Kubernetes Version | End of Standard Support | Support Added with Controller Release |
---|---|---|
v1.31.x | 28 Sep 2025 | v2.10 |
v1.30.x | 28 Jun 2025 | v2.7 |
v1.29.x | 28 Feb 2025 | v2.4 |
v1.28.x | 28 Oct 2024 | v2.0 |
v1.27.x (EOL) | 28 Jun 2024 | v1.27 |
v1.26.x (EOL) | 28 Feb 2024 | v1.25 |
v1.25.x (EOL) | 27 Oct 2023 | v1.19 |
v1.24.x (EOL) | 28 Jul 2023 | v1.15 |
v1.23.x (EOL) | 28 Feb 2023 | v1.11 |
Important
During a cluster upgrade from version 1.29 to 1.30.1 (the latest 1.30.x version), the job that performs pod health checks gets scheduled on Windows nodes, even though it is intended for Linux nodes only. This behavior is due to an upstream issue. To ensure a smooth upgrade, drain the Windows nodes before initiating the cluster upgrade using: kubectl drain <windows_node> --ignore-daemonsets
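As a hedged example, the sequence below drains the Windows nodes before the upgrade and returns them to service afterwards; the node name is a placeholder, and the uncordon step is standard kubectl practice rather than a Rafay-specific requirement.
```bash
# List the Windows nodes in the cluster
kubectl get nodes -l kubernetes.io/os=windows

# Drain each Windows node before starting the upgrade
# (--ignore-daemonsets leaves DaemonSet-managed pods in place)
kubectl drain <windows_node> --ignore-daemonsets

# After the upgrade completes, make the node schedulable again
kubectl uncordon <windows_node>
```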
Explore our blog for deeper insights on Kubernetes v1.30 for Rafay MKS, available here!
Node Management and Cluster Upgrade Guidelines¶
Prerequisites¶
- Master Nodes: OCI instances with 8 OCPUs (16 vCPUs) and 32 GB memory
- Worker Nodes: OCI instances with 1 OCPU (2 vCPUs) and 4 GB memory
Recommendations¶
- Node Provisioning and Addition: Provision or add nodes in batches of up to 100 nodes at a time (a quick readiness check between batches is sketched below)
- Node Deletion: Delete nodes in batches of up to 100 nodes at a time
- Cluster Upgrade: If some nodes fail to upgrade on the first attempt, the retry mechanism will upgrade the remaining nodes
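As a simple sanity check between batches, node registration and readiness can be verified with kubectl; this is illustrative only, since node provisioning and deletion themselves are performed through the controller.
```bash
# Count nodes currently reporting Ready
kubectl get nodes --no-headers | awk '$2 == "Ready"' | wc -l

# Or block until every node reports the Ready condition (adjust the timeout)
kubectl wait --for=condition=Ready nodes --all --timeout=30m
```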
Important
The system has been successfully qualified to support 500 nodes with 10,000 pods and operates effectively without reaching its limits.
Container Networking (CNI)¶
The following CNIs are supported for Upstream Kubernetes on bare metal and VM based environments.
CNI | Description |
---|---|
Cilium | Recommended for Linux nodes |
Calico | Recommended for both Linux and Windows nodes |
Canal | Calico + Flannel |
Flannel | Deprecated. Not recommended for new clusters |
CPU and Memory¶
Architecture¶
- The control plane (aka. master) needs to be "Linux/x64" based architecture.
- The nodes (aka. workers) can be "Linux/x64" or "Linux/arm64" or "Windows/x64" based architecture.
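The architecture and operating system reported by each node can be checked with the standard Kubernetes node labels; the commands below are a generic illustration.
```bash
# Show the architecture and operating system label for every node
kubectl get nodes -L kubernetes.io/arch -L kubernetes.io/os

# Workloads can be pinned to an architecture with a nodeSelector on
# the kubernetes.io/arch label (e.g. amd64 or arm64)
```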
Minimal Blueprint¶
The minimum resource requirements for a single node, converged cluster with the "minimal" cluster blueprint are the following:
Resource | Minimum |
---|---|
vCPUs per Node | Two (2) |
Memory per Node | Four (4) GB |
default-upstream blueprint¶
The minimum resource requirements for a single node cluster with the "default-upstream" cluster blueprint are the following:
Single Node Cluster
Resource | Minimum | Cores |
---|---|---|
vCPUs per Node | Two (2) | Four (4) |
Memory per Node | Sixteen (16) GB | NA |
HA Cluster
Resource | Minimum | Cores |
---|---|---|
vCPUs per Node | Two (2) | Four (4) |
Memory per Node | Sixteen (16) GB | NA |
Important
- Ensure you provision additional resources if you wish to update/deploy other types of blueprints that will deploy additional software on the cluster such as monitoring, storage, etc.
- To change the blueprint from "default-upstream" to another blueprint after cluster provisioning, users must delete the workload deployments and workload PVCs.
GPU¶
Nvidia GPUs compatible with Kubernetes are supported. Follow these instructions if your workloads require GPUs.
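As a hedged sketch, once the NVIDIA device plugin or GPU operator has been installed per those instructions, a pod requests a GPU through the nvidia.com/gpu resource; the pod name and image tag below are placeholders.
```bash
# Sketch of a pod requesting a single NVIDIA GPU (assumes the NVIDIA
# device plugin or GPU operator is already running on the cluster)
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test              # hypothetical name
spec:
  restartPolicy: Never
  containers:
  - name: cuda
    image: nvidia/cuda:12.4.1-base-ubuntu22.04   # example image
    command: ["nvidia-smi"]
    resources:
      limits:
        nvidia.com/gpu: 1           # request one GPU
EOF
```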
Container Runtime¶
Starting with k8s v1.20.x, support for Dockershim has been removed. New clusters will be provisioned with the containerd CRI, and when older versions of k8s are upgraded in place, they will also be upgraded to use the containerd CRI. Customers should therefore account for their k8s resources being restarted.
"containerd" is a container runtime that implements the CRI spec. It pulls images from registries, manages them, and then hands them over to a lower-level runtime that actually creates and runs the container processes. Containerd was separated out of the Docker project to make Docker more modular.
Inter-Node Networking¶
For multi node clusters, ensure that the nodes are configured to communicate with each other over all UDP/TCP ports.
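A quick way to spot-check connectivity between two nodes is netcat, assuming it is installed; the address and port below are placeholders.
```bash
# From one node, verify that another node accepts connections on a TCP port
nc -zv <other_node_ip> 10250
```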
Network Rules: Control Plane¶
Ensure that network rules on the control plane (aka. master) nodes are configured for the ports and direction described below.
Protocol | Direction | Port Range | Purpose |
---|---|---|---|
TCP | Inbound | 6443 | k8s API Server |
TCP | Inbound | 2379-2380 | etcd Client API |
TCP | Inbound | 10250, 10255 | kubelet API |
TCP | Inbound | 10259, 10251 | kube-scheduler |
TCP | Inbound | 10257, 10252 | kube-controller-manager |
UDP | Inbound | 8285 | Flannel CNI |
TCP | Inbound | 30000-32767 | If nodePort needs to be exposed on control plane |
TCP | Inbound | 9099 | Calico CNI |
TCP | Inbound | 5656 | OpenEBS Local PV |
UDP | Inbound | 4789 | vxlan |
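On RHEL-family control plane nodes that use firewalld, opening these ports might look like the sketch below; adapt it to whatever firewall tooling your environment actually uses.
```bash
# Open control plane ports with firewalld (RHEL/Rocky/AlmaLinux example)
sudo firewall-cmd --permanent --add-port=6443/tcp
sudo firewall-cmd --permanent --add-port=2379-2380/tcp
sudo firewall-cmd --permanent --add-port=10250/tcp --add-port=10255/tcp
sudo firewall-cmd --permanent --add-port=10259/tcp --add-port=10251/tcp
sudo firewall-cmd --permanent --add-port=10257/tcp --add-port=10252/tcp
sudo firewall-cmd --permanent --add-port=30000-32767/tcp
sudo firewall-cmd --permanent --add-port=9099/tcp --add-port=5656/tcp
sudo firewall-cmd --permanent --add-port=8285/udp --add-port=4789/udp
sudo firewall-cmd --reload
```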
Network Rules: Node¶
Ensure that the network rules on the nodes (aka. worker) are configured for the ports and direction described below.
Protocol | Direction | Port Range | Purpose |
---|---|---|---|
TCP | Inbound | 10250, 10255 | Kubelet API |
TCP | Inbound | 30000-32767 | NodePort Services |
UDP | Inbound | 8285, 8472 | Flannel CNI |
TCP | Inbound | 8500 | Consul |
UDP | Inbound | 8600 | Consul |
TCP/UDP | Inbound | 8301 | Consul |
TCP | Inbound | 9099 | Calico CNI |
TCP | Inbound | 5656 | OpenEBS Local PV |
UDP | Inbound | 4789 | vxlan |
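On Ubuntu worker nodes that use ufw, the equivalent sketch could be:
```bash
# Open worker node ports with ufw (Ubuntu example)
sudo ufw allow 10250/tcp && sudo ufw allow 10255/tcp
sudo ufw allow 30000:32767/tcp
sudo ufw allow 8285/udp && sudo ufw allow 8472/udp
sudo ufw allow 8500/tcp && sudo ufw allow 8600/udp
sudo ufw allow 8301            # Consul uses both TCP and UDP on 8301
sudo ufw allow 9099/tcp && sudo ufw allow 5656/tcp
sudo ufw allow 4789/udp
```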
Forward Proxy¶
Enable and configure this setting if your instances are not allowed direct connectivity to the controller and all requests have to be forwarded by a non-transparent proxy server.
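The exact proxy details are supplied as part of cluster configuration; purely as a generic illustration, a non-transparent forward proxy is usually described with the standard proxy environment variables shown below (all values are placeholders).
```bash
# Placeholder values -- substitute your proxy endpoint and exclusion list
export HTTP_PROXY=http://proxy.example.com:3128
export HTTPS_PROXY=http://proxy.example.com:3128
export NO_PROXY=localhost,127.0.0.1,10.0.0.0/8,.svc,.cluster.local
```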
Storage¶
Multiple turnkey storage integrations are available as part of the standard cluster infrastructure blueprint. These integrations dramatically simplify and streamline the operational burden associated with provisioning and management of Persistent Volumes (PVs) especially for bare metal and VM based environments.
We have worked to eliminate the underlying configuration and operational complexity associated with storage on Kubernetes. From a cluster administrator perspective, there is nothing to do other than "select" the required option. These turnkey storage integrations also help ensure that stateful workloads can immediately benefit from "dynamically" provisioned PVCs.
Local PV¶
Required/mandatory storage class.
- Based on OpenEBS for upstream Kubernetes clusters on bare metal and VM based environments.
- Based on Amazon EBS for upstream Kubernetes clusters provisioned on Amazon EC2 environments. Requires configuration with an appropriate AWS IAM Role for the controller to dynamically provision EBS based PVCs for workloads.
A Local PV is particularly well suited for the following use cases (a minimal PVC sketch follows this list):
- Stateful workloads that are already capable of performing their own replication for HA and basic data protection. This eliminates the need for the underlying storage to copy or replicate the data for these purposes. Good examples are MongoDB, Redis, Cassandra and Postgres.
- Workloads that need very high throughput (e.g. SSDs) from the underlying storage with a guarantee of data consistency on disk.
- Single node, converged clusters where networked, distributed storage is not available or possible (e.g. developer environments, edge deployments).
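As a hedged sketch, a workload consumes the Local PV class through an ordinary PersistentVolumeClaim; the storage class name below is a placeholder, so check `kubectl get storageclass` for the actual name on your cluster.
```bash
# List the storage classes available on the cluster
kubectl get storageclass

# Sketch of a PVC against the local PV storage class; the class name
# "local-storage" and the claim name are placeholders
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-local-pvc
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: local-storage   # placeholder -- use your cluster's class
  resources:
    requests:
      storage: 5Gi
EOF
```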
Distributed Storage¶
This is optional for customers and based on Rook-Ceph. This option is well suited for environments that need to provide a highly available, shared storage platform. This allows pods to be rescheduled on any worker node on the cluster and still be able to use the underlying PVC transparently.
Important
The GlusterFS based managed storage option was deprecated in Q1 2022 and projected to be EOL in Q1 2023.
Storage Requirements¶
Use the information below to ensure you have provisioned sufficient storage for workloads on your cluster.
Root Disk¶
The root disk for each node is used for the following:
- Container images (cached locally for performance)
- Kubernetes data and binaries
- etcd data
- consul data
- system packages
- Logs for components listed above
Logs are automatically rotated using "logrotate". From a storage capacity planning perspective, ensure that you have provisioned sufficient storage in the root disk to accommodate your specific requirements.
- Raw, unformatted
- Min: 50 GB, Recommended: >100 GB
Note
On a single node cluster, a baseline of 30 GB of storage is required for logs, images, etc. The remaining 20 GB will be used for PVCs consumed by workloads. Allocate and plan for additional storage appropriately for your workloads.
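For capacity planning on an existing node, the directories that typically consume the root disk can be inspected directly; the paths below are common defaults and may differ in your environment.
```bash
# Typical locations for images, kubelet data, etcd data and logs
sudo du -sh /var/lib/containerd /var/lib/kubelet /var/lib/etcd /var/log 2>/dev/null
# Overall root disk utilization
df -h /
```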
Secondary Disk¶
Optional; required only if Rook-Ceph is selected. This disk is dedicated to and used only for end user workload PVCs (a quick verification is shown below).
- Raw, unformatted
- Min: 50 GB, Recommended: >100 GB per node
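To confirm that the secondary disk attached for Rook-Ceph is raw and unformatted, a quick check on the node is:
```bash
# A raw, unformatted disk shows an empty FSTYPE column and no partitions
lsblk -f
```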