Lifecycle

Rafay VM Restart

Scenario

The qcow-image-based Rafay VM has rebooted or restarted, for example because the underlying hardware was power cycled.

Once the Rafay VM goes down, the Rafay Controller will detect the missed heartbeat. The Rafay Controller will automatically send a notification to Rafay Operations and optionally to other recipients.
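The detection logic described above can be sketched roughly as follows. This is an illustration only, not the Rafay Controller's actual implementation: the class name, the 30-second threshold used in the usage note, and the recipient names are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class HeartbeatMonitor:
    """Hypothetical missed-heartbeat detector (not Rafay's real code)."""
    timeout_s: float                      # assumed threshold; the real value is not documented here
    recipients: List[str]                 # e.g. operations plus optional extra recipients
    notify: Callable[[str, str], None]    # called as notify(recipient, message)
    last_seen: Dict[str, float] = field(default_factory=dict)
    alerted: Set[str] = field(default_factory=set)

    def record(self, cluster: str, now: float) -> None:
        """Store the time of the latest heartbeat and clear any earlier alert."""
        self.last_seen[cluster] = now
        self.alerted.discard(cluster)

    def check(self, now: float) -> List[str]:
        """Return clusters whose heartbeat is newly overdue, notifying each recipient once."""
        missed = []
        for cluster, seen in self.last_seen.items():
            if now - seen > self.timeout_s and cluster not in self.alerted:
                self.alerted.add(cluster)
                for recipient in self.recipients:
                    self.notify(recipient, f"{cluster}: heartbeat missed")
                missed.append(cluster)
        return missed
```

For instance, with timeout_s=30 and a heartbeat recorded at t=0, a check at t=45 reports the cluster once; later checks stay silent until a new heartbeat arrives and is missed again.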

Typical Steps

  • Restart the Rafay VM via OpenStack and ensure that the VM comes up successfully with its previously attached storage volumes.
  • The rest of the process is zero-touch and hands-off.

The k8s cluster, the Rafay k8s operator (Rafay k8s agent) and all configured/deployed customer workloads will automatically become operational. No manual intervention is required from the customer or the infrastructure administrator.

Once the Rafay k8s operator becomes operational, it will reestablish connectivity to the Rafay Controller, report its status and health, and retrieve any new instructions. If required, the infrastructure administrator can log in to the Rafay Console to double-check the cluster’s health.

Required Time

Assuming there are no issues with the hardware, storage or network, the Rafay VM will become operational with the core k8s services and the Rafay k8s operator in approximately 2 to 5 minutes.
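An administrator who wants to script the wait could poll a health probe until the upper end of that window has passed. The sketch below is a generic polling helper, not a Rafay tool: the is_healthy callback is a placeholder for whatever check is available (for example, a check performed through the Rafay Console), and the 10-second polling interval is an assumption.

```python
import time
from typing import Callable


def wait_until_healthy(
    is_healthy: Callable[[], bool],            # hypothetical probe supplied by the caller
    timeout_s: float = 300.0,                  # 5 minutes, the upper end of the documented window
    interval_s: float = 10.0,                  # assumed polling interval
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll is_healthy() until it returns True or timeout_s elapses."""
    deadline = clock() + timeout_s
    while True:
        if is_healthy():
            return True
        if clock() >= deadline:
            return False   # still unhealthy after the window: investigate manually
        sleep(interval_s)
```

The clock and sleep parameters exist only so that the helper can be exercised without real waiting; in normal use the defaults apply.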


Server Hardware Failure

Scenario

The server hardware has to be replaced because of a hardware failure. This requires the Rafay VM to be recreated on a replacement server.

Typical Steps

Assume the customer expects the replacement k8s cluster to be created with the same name as the old one. For example, suppose the old cluster was named "Acme" with "Location = Boston" and ran in the Rafay VM on the failed server.

  • Delete the "Acme" cluster in the Rafay Controller.
  • Create a new cluster with the same name, i.e. "Acme", in the Rafay Controller.
  • Follow the instructions for provisioning a new cluster.
  • Once the cluster is operational and healthy, notify the end user to republish their workloads to the replacement cluster.
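
The order of these steps matters, and the sketch below walks through it against an in-memory stand-in for the controller. Everything in it is hypothetical: InMemoryController and its methods are not the Rafay API, and the rule that a cluster name cannot be reused until the old record is deleted is an assumption inferred from the step order. It also illustrates why the end user has to republish: the new cluster record starts with no workloads.

```python
from typing import Callable, Dict, List


class InMemoryController:
    """Hypothetical stand-in for the Rafay Controller; not a real client."""

    def __init__(self) -> None:
        self.clusters: Dict[str, dict] = {}

    def delete_cluster(self, name: str) -> None:
        self.clusters.pop(name, None)

    def create_cluster(self, name: str, location: str) -> None:
        # Assumption: names are unique, so the old record must be deleted first.
        if name in self.clusters:
            raise ValueError(f"cluster {name!r} already exists")
        self.clusters[name] = {"location": location, "healthy": False, "workloads": []}

    def is_healthy(self, name: str) -> bool:
        return self.clusters[name]["healthy"]


def replace_cluster(
    controller: InMemoryController,
    name: str,
    location: str,
    provision: Callable[[str], None],   # placeholder for the documented provisioning steps
) -> List[str]:
    """Run the four replacement steps in order and return a log of what was done."""
    log = []
    controller.delete_cluster(name)
    log.append(f"deleted {name}")
    controller.create_cluster(name, location)
    log.append(f"created {name}")
    provision(name)
    log.append(f"provisioned {name}")
    if not controller.is_healthy(name):
        raise RuntimeError(f"{name} is not healthy; do not notify the end user yet")
    log.append(f"notify end user to republish workloads to {name}")
    return log
```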

Server Hardware Upgrade

Scenario

The server hardware has to be upgraded and replaced. This requires the Rafay VM to be recreated on the replacement server.

Typical Steps

Assume the customer expects the replacement k8s cluster to be created with the same name as the old one. For example, suppose the old cluster was named "Beta" with "Location = Chicago" and ran in the Rafay VM on the older server slated for replacement.

  • Delete the "Beta" cluster in the Rafay Controller.
  • Create a new cluster with the same name, i.e. "Beta", in the Rafay Controller.
  • Follow the instructions for provisioning a new cluster.
  • Once the cluster is operational and healthy, notify the end user to republish their workloads to the replacement cluster.