Requirements
This document outlines the hardware, software, and networking requirements for deploying and operating the "VM as a Service" offering based on Rafay GPU PaaS on bare metal infrastructure. It covers prerequisites, infrastructure components, third-party dependencies, controller/agent needs, and day-2 operations.
Bare Metal Servers¶
Minimum Resources¶
Each server must meet minimum thresholds for CPU, memory, and (for GPU workloads) GPU.
Storage¶
Network storage is recommended so that a server hardware failure does not cause loss of VM data.
Operating System¶
Ubuntu 22.04 LTS or 24.04 LTS must be pre-installed and accessible on each server.
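As a rough pre-flight check for the operating system requirement, the sketch below parses the contents of `/etc/os-release` and reports whether the host runs one of the two Ubuntu releases listed above. The function names are illustrative and are not part of any Rafay tooling.

```python
# Releases taken from the Operating System requirement above.
SUPPORTED_VERSIONS = {"22.04", "24.04"}


def parse_os_release(text: str) -> dict:
    """Parse os-release style KEY=value lines into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be wrapped in double or single quotes.
        fields[key.strip()] = value.strip().strip("\"'")
    return fields


def is_supported_os(os_release_text: str) -> bool:
    """Return True for Ubuntu 22.04 or 24.04, False otherwise."""
    fields = parse_os_release(os_release_text)
    return (
        fields.get("ID") == "ubuntu"
        and fields.get("VERSION_ID") in SUPPORTED_VERSIONS
    )


def check_local_host(path: str = "/etc/os-release") -> bool:
    """Convenience wrapper that reads the file on the current server."""
    with open(path, encoding="utf-8") as handle:
        return is_supported_os(handle.read())
```

Running `check_local_host()` on each bare metal server (for example over SSH) returns `True` only when the OS matches the requirement.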
Network¶
All servers must reside on a management (mgmt) network. SSH access from the Rafay Controller to each server is mandatory.
SSH Access¶
SSH credentials for bare metal OS access must be provided to the Rafay Controller. Rafay uses them to deploy and manage VMs on the servers.
Rafay Controller Accessibility¶
Ensure the bare metal servers are reachable from the Rafay Controller. It is strongly recommended that providers deploy the Rafay Controller on the same management network as the servers.
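Reachability can be spot-checked from the controller host before onboarding. The following is a minimal sketch that only tests whether a TCP connection to the SSH port succeeds. It does not validate credentials, and it assumes SSH listens on the default port 22 unless told otherwise. The function names and the example addresses are hypothetical.

```python
import socket


def ssh_port_reachable(host: str, port: int = 22, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def unreachable_servers(hosts, port: int = 22, timeout: float = 3.0) -> list:
    """Return the subset of hosts whose SSH port could not be reached."""
    return [h for h in hosts if not ssh_port_reachable(h, port, timeout)]
```

For example, `unreachable_servers(["10.0.0.11", "10.0.0.12"])` run from the controller's network returns the servers that still need routing or firewall changes.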
Infrastructure Components¶
GPU-Capable Bare Metal Pool¶
This pool hosts end-user virtual machines (VMs). It must have access to:
- Mounted OS image storage
- Mounted VM block storage
- High-speed, low-latency network for data paths
Non-GPU Bare Metal Server¶
This server handles network virtualization and routing per VPC (e.g., per-tenant VRF and North-South / East-West traffic routing).
Network Interfaces
- Interface 1: Mgmt network (control and monitoring)
- Interface 2: Internet Gateway access for North-South traffic
- Interface 3: Connects to tenant's East-West network segment
Storage Requirements¶
OS Image Storage¶
This storage hosts the disk images used for VM provisioning.
Requirements
- Size: > 1TB
- Type: NFS (high-speed/low-latency) or CephFS
Info
This storage must be mounted on all bare metal servers in the GPU pool.
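One way to confirm the mount on a given server is to inspect `/proc/mounts`. The sketch below assumes a hypothetical mount point of `/mnt/os-images` and treats the filesystem types `nfs`, `nfs4`, `ceph` (kernel CephFS client), and `fuse.ceph-fuse` as acceptable. Adjust both to match the actual deployment.

```python
# Filesystem types as they appear in /proc/mounts (assumed acceptable set).
ALLOWED_FS_TYPES = {"nfs", "nfs4", "ceph", "fuse.ceph-fuse"}


def find_mount(mounts_text: str, mount_point: str):
    """Return (device, fs_type) for mount_point, or None if it is not mounted."""
    found = None
    for line in mounts_text.splitlines():
        parts = line.split()
        # /proc/mounts columns: device, mount point, fs type, options, ...
        if len(parts) >= 3 and parts[1] == mount_point:
            found = (parts[0], parts[2])  # a later entry shadows earlier ones
    return found


def image_store_ok(mounts_text: str, mount_point: str = "/mnt/os-images") -> bool:
    """Return True if mount_point is mounted with an allowed network filesystem."""
    entry = find_mount(mounts_text, mount_point)
    return entry is not None and entry[1] in ALLOWED_FS_TYPES


def check_local_host(mount_point: str = "/mnt/os-images") -> bool:
    """Convenience wrapper that reads /proc/mounts on the current server."""
    with open("/proc/mounts", encoding="utf-8") as handle:
        return image_store_ok(handle.read(), mount_point)
```

A `False` result on any server in the GPU pool means the image store is either missing or backed by a local filesystem on that server.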
VM Block Storage¶
This storage provides persistent block volumes for user VMs.
Requirements
- Type: NFS or Ceph (RBD)
- Size: Scalable, typically > 500TB
Info
This storage must be reachable over a high-speed network from all participating servers.
Network Pool Configuration¶
Public IP Pool¶
This pool is used to dynamically assign public IPs to tenant VMs.
Requirements
- At least 1 public IP per tenant/org
- Optional: Consideration for per-VM public IPs
Info
Specify where these pools are configured in the controller UI
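To illustrate the sizing rule of at least one public IP per tenant/org, here is a small sketch using Python's standard `ipaddress` module. It assumes the pool is expressed as a single CIDR block and that the network and broadcast addresses are not handed out. The CIDR shown in the usage note comes from a documentation-only range, and the function names are hypothetical rather than a description of how the controller allocates addresses.

```python
import ipaddress


def usable_addresses(cidr: str) -> list:
    """List the assignable host addresses in the pool, in ascending order."""
    return [str(addr) for addr in ipaddress.ip_network(cidr).hosts()]


def assign_public_ips(cidr: str, tenants) -> dict:
    """Give each tenant one address from the pool, lowest address first.

    Raises ValueError when the pool is too small for the tenant list.
    """
    addresses = iter(usable_addresses(cidr))
    assignments = {}
    for tenant in tenants:
        try:
            assignments[tenant] = next(addresses)
        except StopIteration:
            raise ValueError(
                f"pool {cidr} cannot cover {len(list(tenants))} tenants"
            ) from None
    return assignments
```

For instance, `assign_public_ips("203.0.113.0/29", ["org-a", "org-b"])` maps the two tenants to the first two host addresses of that block, and a tenant list longer than the pool raises `ValueError`. Per-VM public IPs would need a correspondingly larger block.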
VLAN Pool¶
This pool provides isolated Layer-2 domains for tenants.
Constraints
- Max: 4096 VLANs
- Reserved: 100 VLANs for internal/system use
- Available for tenants: ~3900 VLANs
- Minimum: 1 VLAN per tenant/org
Info
Configure as part of network provisioning policies.
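The VLAN constraints above can be pictured as a simple allocator. The sketch below is illustrative only: it assumes the system-reserved VLANs are the lowest IDs, so that tenant VLANs start at a hypothetical ID 100, and it stops at 4094 because IEEE 802.1Q reserves ID 4095. The exact tenant-available count depends on which IDs are actually reserved in a given deployment. The class and method names are not part of any Rafay API.

```python
class VlanPool:
    """Hands out VLAN IDs to tenants from a contiguous range."""

    def __init__(self, first_id: int = 100, last_id: int = 4094):
        if not 1 <= first_id <= last_id <= 4094:
            raise ValueError("VLAN range must lie within 1..4094")
        # Stored in descending order so pop() yields the lowest free ID.
        self._free = list(range(last_id, first_id - 1, -1))
        self._by_tenant = {}

    @property
    def available(self) -> int:
        """Number of VLAN IDs not yet assigned to any tenant."""
        return len(self._free)

    def allocate(self, tenant: str) -> int:
        """Assign one more VLAN to the tenant and return its ID."""
        if not self._free:
            raise RuntimeError("VLAN pool exhausted")
        vlan_id = self._free.pop()
        self._by_tenant.setdefault(tenant, []).append(vlan_id)
        return vlan_id

    def vlans_for(self, tenant: str) -> list:
        """Return the VLAN IDs currently held by the tenant."""
        return list(self._by_tenant.get(tenant, []))

    def release(self, tenant: str, vlan_id: int) -> None:
        """Return one of the tenant's VLANs to the pool."""
        self._by_tenant[tenant].remove(vlan_id)
        self._free.append(vlan_id)
```

Each tenant/org needs at least one call to `allocate`, and `available` shows how much headroom remains for additional tenants or extra per-tenant segments.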