Requirements

This document outlines the hardware, software, and networking requirements for deploying and operating the "VM as a Service" offering based on Rafay GPU PaaS on bare metal infrastructure. It covers prerequisites, infrastructure components, third-party dependencies, controller/agent needs, and day-2 operations.


Bare Metal Servers

Minimum Resources

Each server must meet the minimum thresholds for CPU, memory, and GPU (the GPU threshold applies only to servers that run GPU workloads).

Storage

Network storage is recommended so that a server hardware failure does not result in the loss of VM data.

Operating System

Ubuntu 22.04 LTS or 24.04 LTS must be pre-installed and accessible on each server.

Network

All servers must reside on a management (mgmt) network. SSH access from the Rafay Controller to each server is mandatory.

SSH Access

SSH credentials for bare metal OS access must be provided to the Rafay Controller. Rafay uses them to deploy and manage VMs on the servers.

Rafay Controller Accessibility

Ensure bare metal servers are reachable from the Rafay Controller. It is strongly recommended that providers deploy the Rafay Controller on the same management network.
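
Before onboarding servers, it can help to confirm that each one accepts TCP connections on its SSH port from the host where the controller runs. The following is a minimal sketch, not a Rafay tool: the function names are hypothetical, and port 22 is an assumption that should be changed if the servers use a different SSH port. It checks only that the port is open, not that the credentials are valid.

```python
import socket
from typing import Callable, Iterable, List


def ssh_port_reachable(host: str, port: int = 22, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def unreachable_servers(
    hosts: Iterable[str],
    check: Callable[[str], bool] = ssh_port_reachable,
) -> List[str]:
    """Return the hosts for which the reachability check fails."""
    return [host for host in hosts if not check(host)]
```

Run from the controller host, `unreachable_servers(["10.0.0.11", "10.0.0.12"])` (example addresses) returns the servers that still need network or firewall changes.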


Infrastructure Components

GPU-Capable Bare Metal Pool

This pool hosts end-user virtual machines (VMs). It must have access to:

  • Mounted OS image storage
  • Mounted VM block storage
  • High-speed, low-latency network for data paths

Non-GPU Bare Metal Server

This server is used for network virtualization and routing per VPC (e.g., per-tenant VRF and N-S / E-W traffic routing).

Network Interfaces

  • Interface 1: Mgmt network (control and monitoring)
  • Interface 2: Internet Gateway access for North-South traffic
  • Interface 3: Connects to the tenant’s East-West network segment

Storage Requirements

OS Image Storage

This storage hosts the disk images used for VM provisioning.

Requirements

  • Size: > 1TB
  • Type: NFS (high-speed/low-latency) or CephFS

Info

This storage must be mounted on all bare metal servers in the GPU pool.
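
One way to verify the mount requirement on a server is to inspect `/proc/mounts` and confirm that the image store is mounted with an expected filesystem type. The sketch below rests on assumptions: the mount point `/mnt/os-images` is hypothetical, and the accepted type names (`nfs`, `nfs4`, `ceph`, `fuse.ceph-fuse`) reflect how NFS and CephFS mounts usually appear on Linux, so adjust them to match the actual setup.

```python
from typing import Optional

# Assumed filesystem type names as they appear in /proc/mounts.
ACCEPTED_FS_TYPES = {"nfs", "nfs4", "ceph", "fuse.ceph-fuse"}


def mounted_fs_type(mounts_text: str, mount_point: str) -> Optional[str]:
    """Return the filesystem type mounted at mount_point, or None if absent."""
    fs_type = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 3 and fields[1] == mount_point:
            fs_type = fields[2]  # a later entry shadows an earlier one
    return fs_type


def image_store_ok(mounts_text: str, mount_point: str) -> bool:
    """True if mount_point is mounted with one of the accepted types."""
    return mounted_fs_type(mounts_text, mount_point) in ACCEPTED_FS_TYPES
```

On a server, the input can be read with `open("/proc/mounts").read()` and passed together with the mount point, for example `image_store_ok(text, "/mnt/os-images")`.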

VM Block Storage

This storage provides persistent block volumes for user VMs.

Requirements

  • Type: NFS or Ceph (RBD)
  • Size: Scalable, typically > 500TB

Info

This storage must be reachable over a high-speed network from all participating servers.


Network Pool Configuration

Public IP Pool

This pool is used to dynamically assign public IPs to tenant VMs.

Requirements

  • At least 1 public IP per tenant/org
  • Optional: Consideration for per-VM public IPs

Info

Specify where these pools are configured in the controller UI
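
As a rough planning aid, the sizing rule above (at least one public IP per tenant/org, plus any VMs that get their own public IP) can be checked against a candidate CIDR block. This is a sketch under assumptions: the function names are hypothetical, and it treats the network and broadcast addresses as unusable, which may not match how a given provider carves up its public range.

```python
import ipaddress


def usable_public_ips(cidr: str) -> int:
    """Count assignable host addresses in the CIDR block."""
    network = ipaddress.ip_network(cidr, strict=True)
    # hosts() skips the network and broadcast addresses for typical prefixes.
    return sum(1 for _ in network.hosts())


def pool_is_sufficient(cidr: str, tenant_count: int, vms_with_public_ip: int = 0) -> bool:
    """True if the pool covers one IP per tenant plus optional per-VM IPs."""
    required = tenant_count + vms_with_public_ip
    return usable_public_ips(cidr) >= required
```

For example, with the documentation range `203.0.113.0/28`, the pool covers 10 tenants but not 10 tenants plus 5 VMs with dedicated public IPs.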

VLAN Pool

This pool provides isolated Layer-2 domains for tenants.

Constraints

  • Max: 4096 VLANs
  • Reserved: 100 VLANs for internal/system use
  • Available for tenants: ~3900 VLANs
  • Minimum: 1 VLAN per tenant/org

Info

Configure as part of network provisioning policies
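
The VLAN constraints above amount to handing out IDs from a bounded pool while skipping the reserved ones. The sketch below illustrates that bookkeeping only and is not how the Rafay Controller implements it: the `VlanPool` class is hypothetical, and the default ID range and the choice of which IDs are reserved are assumptions, since this document does not specify them.

```python
import bisect
from typing import Dict, Iterable, List


class VlanPool:
    """Hands out one VLAN ID per tenant from a bounded pool."""

    def __init__(self, first_id: int = 2, last_id: int = 4094,
                 reserved: Iterable[int] = ()) -> None:
        reserved_ids = set(reserved)  # IDs held back for internal/system use
        self._free: List[int] = [
            vlan for vlan in range(first_id, last_id + 1)
            if vlan not in reserved_ids
        ]
        self._by_tenant: Dict[str, int] = {}

    @property
    def available(self) -> int:
        """Number of VLAN IDs not yet assigned to a tenant."""
        return len(self._free)

    def allocate(self, tenant: str) -> int:
        """Return the tenant's VLAN, assigning the lowest free ID if needed."""
        if tenant in self._by_tenant:
            return self._by_tenant[tenant]
        if not self._free:
            raise RuntimeError("VLAN pool exhausted")
        vlan = self._free.pop(0)
        self._by_tenant[tenant] = vlan
        return vlan

    def release(self, tenant: str) -> None:
        """Return the tenant's VLAN to the pool, keeping free IDs sorted."""
        vlan = self._by_tenant.pop(tenant)
        bisect.insort(self._free, vlan)
```

A call such as `VlanPool(reserved=range(2, 102))` would hold back 100 IDs for system use and leave the remainder for tenants, with `allocate("org-a")` returning the same ID on repeated calls.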