
This feature enables one-click deployment on bare metal servers using the non-BCM (Metal3/Ironic) provisioning framework.

Two key user personas are supported:

  • Partner Admin: Sets up the baremetal gateway and provisioner in the datacenter, ensuring Metal3/Ironic services are configured for PXE-based provisioning
  • End User: Selects a predefined compute profile from the Developer Hub and enters minimal details such as an SSH public key and user data (both optional)

This model provides the same simplicity as VM-based GPU PaaS but leverages physical bare metal servers for maximum performance, suitable for large-scale AI training and inference workloads.


Partner Admin

  • Deploys the Baremetal Gateway and Baremetal Provisioner in the datacenter
  • Ensures PXE boot networking is configured and Metal3/Ironic has access to machine BMCs for power and provisioning control
  • Validates prerequisites such as VLAN pools, storage VLAN reachability, and DHCP/TFTP services on the PXE network
  • Publishes bare metal nodes into inventory for use in cluster creation

Step 1: Create a Gateway

Partner Admins must first create a Gateway, which acts as a secure bridge to enable the baremetal provisioner on the head-node in a data center. Baremetal provisioning and future self-service provisioning processes are performed through this Gateway, and any errors encountered during these processes are reported via the same Gateway.

  • Navigate to Infrastructure → Gateway and click New Gateway (in the default project or system catalog)
  • Provide a Name and Description
  • From the Type dropdown, select Baremetal
  • Click Create to save


Step 2: Run the Setup Command on the Head Node

Once the Baremetal Gateway is created, the platform generates a Setup Command. This command must be executed on the head node of the data center.

  • In the Gateway list, click View Details for your newly created gateway
  • Copy the Setup Command displayed in the popup window


  • Log in to the head node of your data center and run the command.

When executed, the command:

  • Downloads and installs the Infra Agent binary.
  • Runs the agent at the host level on the node.
  • Establishes a secure connection between the head node and the controller.

This ensures that the baremetal environment is registered and ready for cluster provisioning.

Step 3: Baremetal Provisioner Setup

Once the Baremetal Gateway is created, the next step is to create a Baremetal Provisioner in the same project.

The Baremetal tab is available under Infrastructure only when the feature flag is enabled for the organization.

â„šī¸ Note
To access the Baremetal tab, ensure that the Baremetal feature flag is enabled in the Ops Console for the default organization.

  1. Navigate to Infrastructure → Baremetal
  2. Click New Baremetal Provisioner
  3. Fill in the required details in the Create New Baremetal Provisioner form:
| Field | Description | Example |
|---|---|---|
| Name | Name of the Baremetal Provisioner | bm-provisioner-01 |
| Baremetal Gateway | Select the Baremetal Gateway created earlier | bm-gateway-01 |
| Provisioning Interface | Network interface on the gateway used for PXE booting; must be L2-connected | eth1 |
| Provisioning IP | IP address assigned to the provisioning interface | 192.168.110.10 |
| DHCP Range | Configure bare metal nodes with a DHCP range (e.g., 192.168.110.100 – 192.168.110.200) and optionally assign an infinite lease time | 192.168.110.100,192.168.110.200 or 192.168.110.100,192.168.110.200,infinite |
| DHCP Gateway IP | Default gateway IP for the provisioning network | 192.168.110.1 |
| DHCP DNS IP (Optional) | DNS server used by target machines during PXE provisioning | 8.8.8.8 |
| DHCP Hosts (Optional) | Static MAC-to-IP mappings (e.g., MAC1,IP1;MAC2,IP2) | aa:bb:cc:dd:ee:ff,192.168.110.150;ab:bb:cc:dd:ee:ff,192.168.110.149 |
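
The DHCP Range and DHCP Hosts values are comma- and semicolon-delimited strings, so it can help to sanity-check them before submitting the form. The following is a minimal sketch, not part of the platform: the function names are hypothetical, and the `192.168.110.0/24` provisioning network is an assumption inferred from the example addresses in the table.

```python
import ipaddress


def parse_dhcp_range(value, network):
    """Parse 'start,end' or 'start,end,lease' and check both ends lie in `network`."""
    parts = [p.strip() for p in value.split(",")]
    if len(parts) not in (2, 3):
        raise ValueError(f"expected 'start,end[,lease]', got {value!r}")
    start, end = (ipaddress.ip_address(p) for p in parts[:2])
    lease = parts[2] if len(parts) == 3 else None
    net = ipaddress.ip_network(network)
    if start not in net or end not in net:
        raise ValueError("DHCP range falls outside the provisioning network")
    if start > end:
        raise ValueError("DHCP range start is after its end")
    return start, end, lease


def parse_dhcp_hosts(value):
    """Parse 'MAC1,IP1;MAC2,IP2' into a {mac: ip} mapping."""
    hosts = {}
    for entry in filter(None, (e.strip() for e in value.split(";"))):
        mac, ip = (field.strip() for field in entry.split(","))
        if len(mac.split(":")) != 6:
            raise ValueError(f"malformed MAC address: {mac!r}")
        hosts[mac.lower()] = ipaddress.ip_address(ip)
    return hosts


# Example values taken from the table; the /24 network is an assumption.
start, end, lease = parse_dhcp_range(
    "192.168.110.100,192.168.110.200,infinite", "192.168.110.0/24"
)
hosts = parse_dhcp_hosts(
    "aa:bb:cc:dd:ee:ff,192.168.110.150;ab:bb:cc:dd:ee:ff,192.168.110.149"
)
```

A check like this catches a range that does not sit on the provisioning network, or a static mapping with a malformed MAC address, before provisioning is attempted.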

Click Create to save the provisioner configuration.

Once the configuration is created, click Provision from the options menu. Provisioning performs the following:

  • A lightweight K3s cluster is installed on the head node.
  • The Metal3/Ironic Operator is installed

Tenant Setup and System Profile Sharing

Baremetal Provisioner is a one-time setup for each tenant.

  • Each tenant will have a head node and a set of servers allocated to them.
  • This Baremetal Provisioner is used for provisioning machines for that tenant going forward.

If the provisioning fails, the Status column shows Failed. Click the Failed status to view detailed error information, resolve the issue, and retry the provisioning.


Step 4: Baremetal Inventory Details

After baremetal provisioning is completed, add the inventory details. The inventory provides information about available baremetal machines and their interfaces, which are later used during machine requests and provisioning.

Required Inventory Fields

| Field | Description | Example |
|---|---|---|
| Allocation Status | Must be set to Available for provisioning. | Available |
| Username / Password | Access credentials for the baremetal machine. | admin / passw0rd |
| Power Management (BMC) | Credentials for controlling machine power operations. | IPMI/Redfish |
| Interfaces (PXE / Other) | At least one PXE boot interface must be defined and labeled bootstrap. Additional interfaces can be labeled for other use cases (e.g., tan, tan_ha). | PXE: MAC-1, label=bootstrap; TAN: MAC-2, label=tan |
| Tags | Must match the Compute Profile tags. These enable automatic mapping between the environment template and inventory machines. The same key-value must also be added to compute profile for baremetal provisioning. | server_type, L40 |

â„šī¸ Note: For detailed documentation on configuring BMC, interfaces, allocation status, username/password, refer to the Inventory Devices documentation.

Power Management (BMC)

  • Baremetal hardware must support either IPMI or Redfish protocol for power management.

  • IPMI:

    • Typically used by older servers (e.g., Yotta hardware).
    • Only an IP address is required as the endpoint.
    • Example:
      192.168.0.1
      
  • Redfish:

    • Supported by modern servers (e.g., Dell iDRAC-Redfish).
    • Endpoint format may vary depending on the hardware.
    • Examples:
      idrac-redfish://192.168.1.9/redfish/v1/Systems/System.Embedded.1
      redfish-virtualmedia+http://172.32.0.1:8000/redfish/v1/Systems/gk-bm-3
      
  • The same username and password fields are used for both IPMI and Redfish.
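
Because the endpoint format differs between the two protocols, a small classifier can make the distinction explicit. This is a sketch under stated assumptions: `classify_bmc_endpoint` is a hypothetical helper, and treating any URL whose scheme contains `redfish` as a Redfish endpoint is a heuristic based only on the examples above.

```python
import ipaddress
from urllib.parse import urlsplit


def classify_bmc_endpoint(endpoint):
    """Return ('ipmi', host) for a bare IP address, or ('redfish', host) for a Redfish-style URL."""
    text = endpoint.strip()
    try:
        # IPMI endpoints are given as a plain IP address.
        return "ipmi", str(ipaddress.ip_address(text))
    except ValueError:
        pass
    parts = urlsplit(text)
    # Heuristic: schemes such as idrac-redfish or redfish-virtualmedia+http.
    if "redfish" in parts.scheme and parts.hostname:
        return "redfish", parts.hostname
    raise ValueError(f"unrecognized BMC endpoint: {endpoint!r}")


print(classify_bmc_endpoint("192.168.0.1"))
print(classify_bmc_endpoint(
    "idrac-redfish://192.168.1.9/redfish/v1/Systems/System.Embedded.1"
))
print(classify_bmc_endpoint(
    "redfish-virtualmedia+http://172.32.0.1:8000/redfish/v1/Systems/gk-bm-3"
))
```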

Step 5: System Profile Sharing

Prerequisites

Before sharing the system profile with tenant organizations, complete the following steps:

  1. Load the bmaas EnvironmentTemplate (ET) and ResourceTemplate (RT) in the system-catalog Project of the Default Org
  2. In the Ops Console, under System Resources, confirm that the bmaas EnvironmentTemplate (ET) is configured with default values
  3. Deploy the GitOps Agent either on the Rafay Controller or in the customer environment
  4. Attach the GitOps Agent to the EnvironmentTemplate (ET) and configure the PARTNER_API_KEY value in the ET
    • Alternatively, configure the GitOps Agent and PARTNER_API_KEY in Global Settings via the Swagger API

Share the Compute Profile(s)

After the prerequisites are complete, the compute profile must be shared with the corresponding tenant organization:

  1. Navigate to Ops Console → System Resources → Compute Profiles
  2. Select the required Profile and share the profile with the target Tenant org(s)

Provisioning with Tenant Profiles

When launching a profile, the admin can edit the required values before sharing the profile with end users.

| Field | Description | Example |
|---|---|---|
| baremetal_provisioner_name | Enter the Baremetal Provisioner Name created earlier. This is used to identify the inventory and perform backend provisioning of baremetal machines. | bm-prov-1 |
| image_checksum | Checksum for the image | https://dev-rafay-vmware-ova.s3.us-west-1.amazonaws.com/baremetal/ubuntu-24.04-baremetal-final-uefi.qcow2.sha256 |
| image_checksum_type | Checksum algorithm for the image, e.g. md5, sha256, or sha512. The special value auto can be used to detect the algorithm from the checksum. If missing, MD5 is used. If in doubt, use auto. | sha256 |
| image_format | Format of the image (raw, qcow2) | qcow2 |
| image_url | Location (URL) of the image to deploy | https://dev-rafay-vmware-ova.s3.us-west-1.amazonaws.com/baremetal/ubuntu-24.04-baremetal-final-uefi.qcow2 |
| system_user_data | A cloud-init configuration in YAML format following the cloud-config schema. By default, a bond network interface configuration template with variables is provided. If a bond interface is not required for the baremetal setup, this parameter must be set to empty. Administrators can override the template variables with hardcoded values or configure them at the inventory device or switch level as needed (refer to the Inventory Devices documentation). They may also override or extend this configuration with any other cloud-init script adhering to the cloud-config schema. | #cloud-config spec (default) |
| tags | Must match the Inventory Device tags. These enable automatic mapping between the environment template and inventory machines. | Tag as key-value pair in JSON format: { "server_type": "L40" } |
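
To illustrate what detecting the algorithm from the checksum can mean for image_checksum_type, the sketch below infers the algorithm from the length of a hex digest and verifies some data against it. It is an assumption that detection works by digest length; the helper names are hypothetical, and the sample bytes stand in for a real image file.

```python
import hashlib

# Hex digest lengths for the three algorithms named in the table.
_ALGORITHM_BY_LENGTH = {32: "md5", 64: "sha256", 128: "sha512"}


def resolve_checksum_type(checksum, checksum_type="auto"):
    """Return the explicit algorithm, or infer it from the digest length for 'auto'."""
    if checksum_type != "auto":
        return checksum_type
    try:
        return _ALGORITHM_BY_LENGTH[len(checksum.strip())]
    except KeyError:
        raise ValueError("cannot infer algorithm from checksum length") from None


def verify_image(data, checksum, checksum_type="auto"):
    """Compare the digest of `data` with the expected hex checksum."""
    algorithm = resolve_checksum_type(checksum, checksum_type)
    digest = hashlib.new(algorithm, data).hexdigest()
    return digest == checksum.strip().lower()


sample = b"not a real image"  # placeholder for the downloaded image bytes
expected = hashlib.sha256(sample).hexdigest()
print(resolve_checksum_type(expected))  # sha256
print(verify_image(sample, expected))   # True
```

Running a check like this against the downloaded image and the digest from the `.sha256` file, before publishing the profile, helps rule out a mismatched image_checksum and image_url pair.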


â„šī¸ Important: - system_user_data allows adding a cloud-init configuration in YAML format, adhering to the cloud-config schema
- This configuration should be added only when a network bond interface configuration is required
- If no bond interface is needed, this section can be skipped

  • Values for placeholders can be added in the Inventory pages
  • Servers → Interfaces Provisioning Failed

  • Switches → VRFs) Provisioning Failed

  • Defined directly in the System User Data config

```
#cloud-config
write_files:
  - path: /etc/netplan/60-bond-config.yaml
    permissions: '0600'
    owner: root:root
    content: |
      network:
        version: 2
        ethernets:
          {{ .TANInterface1 }}: {}
          {{ .TANInterface2 }}: {}
        bonds:
          bond0:
            interfaces:
              - {{ .TANInterface1 }}
              - {{ .TANInterface2 }}
            addresses:
              - {{ .TANIPAddress }}/{{ .SubnetMask }}
            nameservers:
              addresses:
                - 8.8.8.8
              search: []
            routes:
              - to: default
                via: {{ .GatewayIP }}
            parameters:
              lacp-rate: fast
              mode: 802.3ad
              transmit-hash-policy: layer2+3
            mtu: 9000
runcmd:
  - netplan apply
```
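
To show how the `{{ .Name }}` placeholders in the template relate to supplied values, here is a minimal sketch of the substitution. The platform performs the real rendering; this stand-in only mimics simple placeholder replacement, the `render` and `placeholders` helpers are hypothetical, and the addresses in `values` are made-up examples.

```python
import re

# Matches placeholders of the form {{ .Name }}.
PLACEHOLDER = re.compile(r"\{\{\s*\.(\w+)\s*\}\}")


def placeholders(template):
    """List the distinct placeholder names used in a template."""
    return sorted(set(PLACEHOLDER.findall(template)))


def render(template, values):
    """Replace each {{ .Name }} with values['Name']; fail loudly on a missing value."""
    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for placeholder {name}")
        return str(values[name])
    return PLACEHOLDER.sub(substitute, template)


snippet = "addresses:\n- {{ .TANIPAddress }}/{{ .SubnetMask }}\nvia: {{ .GatewayIP }}"
values = {"TANIPAddress": "10.0.0.5", "SubnetMask": "24", "GatewayIP": "10.0.0.1"}  # example values
print(placeholders(snippet))
print(render(snippet, values))
```

Listing the placeholders first is a quick way to confirm which values must exist on the inventory device or switch before the template is used unmodified.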


Tenant Profiles enable the following:

  • Define compute profiles tied to available baremetal resources.
  • Configure options such as GPU type, CPU/memory sizing, and storage.
  • Simplify deployment by exposing only essential parameters (e.g., cluster name, node count).
  • Optionally allow Overrides in input settings so end users can customize values during cluster deployment.
  • Publish profiles to one or more projects for selection during cluster creation.

End User

  • Navigate to Developer Hub → Compute → Baremetal
  • Select the relevant baremetal compute profile shared by the PaaS Admin

Cluster Deployment

  • Provide required details such as:
    • Name (mandatory)
    • Description (optional)
    • SSH Public Key (optional): SSH public key for the baremetal machine (used for login)
    • User Data (optional): A cloud-init configuration in YAML format, adhering to the cloud-config schema
    • Editable inputs as permitted by the Admin


  • Click Deploy to trigger automated provisioning of the baremetal server
  • After successful provisioning, host (baremetal machine) details such as Hostname, PXE_IP, PXE_MAC, Username, and Password are displayed in the console
  • The end user can log in to the machine in either of the following ways:
    • Using the SSH private key corresponding to the public key configured during provisioning:
      ssh -i <ssh_private_key> <username>@<pxe_ip>
    • Using the username and password provided in the host details
