Automation

For automation, it is strongly recommended that users create and manage Fleet Plans to carry out operations across clusters. This approach is well suited to scenarios where multiple clusters require the same set of operations.

A Fleet Plan can be created and managed via two automation methods:

  • RCTL CLI
  • Swagger API

Fleet Plan Automation Lifecycle

Create and Execute Fleet Plan

Use the following command to create and run a Fleet Plan:

./rctl apply -f <fleetplan_filename.yaml>

Below is an example YAML file that creates a Fleet Plan named demo_fleetplan with the configuration details provided in Step 1.

kind: FleetPlan
apiVersion: infra.k8smgmt.io/v3
metadata:
  name: demo_fleetplan
  project: fleetpro2
spec:
  fleet:
    kind: clusters
    labels:
      role: qa
      user: demo_user
    projects:
      - name: fleetpro2
      - name: fleetproj4
      - name: fleetpro
      - name: proj3
  operationWorkflow:
    operations:
      - name: operation-fleetclusters
        prehooks:
          - description: precheck for operation1
            inject:
              - KUBECONFIG
            name: executekubent
            containerConfig:
              runner: agent
              image: ghcr.io/doitintl/kube-no-trouble:latest
              arguments:
                - '-o'
                - json
                - '-e'
              commands:
                - /app/kubent
            timeoutSeconds: 120
            successCondition: 'if #status.container.exitCode == 0 { success: false }'
        action:
          type: patch
          description: set max nodes count
          name: setmaxnodecount
          patchConfig:
            - op: replace
              path: .spec.config.managedNodeGroups[0].maxSize
              value: 33
          continueOnFailure: true
        posthooks:
          - description: post check after operation
            inject:
              - KUBECONFIG
            name: executeposthook
            containerConfig:
              runner: agent
              image: ghcr.io/doitintl/kube-no-trouble:latest
              arguments:
                - '-o'
                - json
                - '-e'
              commands:
                - /app/kubent
            timeoutSeconds: 120
            successCondition: 'if #status.container.exitCode != 0 { failed: true }'
  agents:
    - name: gitops-agent1
    - name: gitops-agent2
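
To make the patch action more concrete, the following Python sketch applies a replace operation of the kind listed under patchConfig to a nested dictionary that stands in for a cluster spec. The helper names parse_path and apply_replace, the path parser, and the sample spec are illustrative assumptions, not the platform's implementation. The sketch only shows which field a path such as .spec.config.managedNodeGroups[0].maxSize addresses.

```python
import re
from copy import deepcopy

# One path segment: a key name optionally followed by a list index, e.g. "managedNodeGroups[0]".
_SEGMENT = re.compile(r"([^.\[\]]+)(?:\[(\d+)\])?")


def parse_path(path):
    """Split '.a.b[0].c' into ['a', 'b', 0, 'c'] (hypothetical helper)."""
    tokens = []
    for key, index in _SEGMENT.findall(path):
        tokens.append(key)
        if index != "":
            tokens.append(int(index))
    return tokens


def apply_replace(doc, path, value):
    """Return a copy of doc with the existing field at path set to value."""
    result = deepcopy(doc)
    tokens = parse_path(path)
    node = result
    for token in tokens[:-1]:
        node = node[token]
    last = tokens[-1]
    node[last]  # a replace targets an existing field; this raises if it is missing
    node[last] = value
    return result


# Hypothetical cluster spec, reduced to the fields the example path touches.
cluster = {"spec": {"config": {"managedNodeGroups": [{"name": "ng1", "maxSize": 4}]}}}
patched = apply_replace(cluster, ".spec.config.managedNodeGroups[0].maxSize", 33)
print(patched["spec"]["config"]["managedNodeGroups"][0]["maxSize"])  # prints 33
```

The original dictionary is left unchanged, so the before and after values can be compared when reviewing a planned patch.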

The next example creates a Fleet Plan with a GPU-based posthook and actions that update the blueprint.

kind: FleetPlan
apiVersion: infra.k8smgmt.io/v3
metadata:
  name: fleet-eks-update-bp-version
  project: demofleet
spec:
  fleet:
    kind: clusters
    labels:
      cluster-type: eks
      env: eks-demo
    projects:
      - name: demofleet
      - name: platform
  operationWorkflow:
    operations:
      - name: update-bp-version
        prehooks:
          - description: list-all-pods
            inject:
              - KUBECONFIG
            name: list-all-pods
            containerConfig:
              runner: cluster
              image: alpine/k8s:1.24.16
              arguments:
                - -c
                - kubectl get pod -A -o wide
              commands:
                - /bin/sh
            timeoutSeconds: 120
            successCondition: 'if #status.container.exitCode == 0 { success: false }'
        action:
          type: patch
          description: update-bp-version
          name: update-bp-version
          patchConfig:
            - op: replace
              path: .spec.blueprintConfig.name
              value: eks-standard-bp
            - op: replace
              path: .spec.blueprintConfig.version
              value: v1.1
          continueOnFailure: true
        posthooks:
          - description: re-check-list-all-pods
            inject:
              - KUBECONFIG
            name: re-check-list-all-pods
            containerConfig:
              runner: cluster
              image: alpine/k8s:1.24.16
              arguments:
                - -c
                - kubectl get pod -A -o wide
              commands:
                - /bin/sh
          - description: gpu benchmarks
            inject:
              - KUBECONFIG
            name: gpu-benchmark
            containerConfig:
              runner: cluster
              image: cemizm/tf-benchmark-gpu
            timeoutSeconds: 120
            successCondition: 'if #status.container.exitCode != 0 { failed: true }'
  agents:
    - name: demofleet-gitops-agents

The example below illustrates the creation of a Fleet Plan with an HTTP Config type prehook.

kind: FleetPlan
apiVersion: infra.k8smgmt.io/v3
metadata:
  name: demo-fp
  project: defaultproject
spec:
  fleet:
    kind: clusters
    labels:
      key: value1
    projects:
      - name: defaultproject
  operationWorkflow:
    operations:
      - name: op1
        prehooks:
          - description: Adding HTTP API
            name: prehook2
            httpConfig:
              method: GET
              endpoint: 'https://google.com'
              headers:
                agent: chrome
            timeoutSeconds: 10
            successCondition: |
              if #status.container.exitCode == 0 {
                success: true
              }
              if #status.container.exitCode != 0  {
                failed: true
              }
        action:
          type: patch
          name: action1
          description: updating nodeGroups
          patchConfig:
            - op: replace
              path: '.spec.config.nodeGroups[0].desiredCapacity'
              value: 2
          continueOnFailure: true

Here is an example of a Fleet Plan with an HTTP prehook that makes an API call to retrieve the edge response. The successCondition is configured so that the prehook is marked successful when the cluster is reported as healthy (health equal to 1 in the response body), which allows the Cluster Upgrade action to proceed.

kind: FleetPlan
apiVersion: infra.k8smgmt.io/v3
metadata:
  name: demo-fleetplan
  project: demo-project1
spec:
  fleet:
    kind: clusters
    labels:
      env: stage
      team: cloud
    projects:
      - name: dev-project
      - name: qa-project
  operationWorkflow:
    operations:
      - name: fleet-operation1
        prehooks:
          - inject:
              - KUBECONFIG
            name: http-prehook1
            httpConfig:
              endpoint: >-
                http://qc-console.stage.rafay.dev/edge/v1/projects/pk0nl7m/edges/249e4nk/
              method: GET
              headers:
                Cookie: >-
                  csrftoken=zbZhWc1PlLB8URiYCmM5uXO3pJm98OLH3ETLJJEEvrtRMn3FvWELyThDgxOr6xp8;
                  rsid=yph7k8w5z26k1or7jyhg3otjoihwixis
            timeoutSeconds: 300
            successCondition: |
              if #status.http.body.health == 1 {
                success: true
              }
              if #status.http.body.health != 1 {
                failed: true
              }
        action:
          type: controlPlaneUpgrade
          name: cp-upgrade
          controlPlaneUpgradeConfig:
            version: '1.28'

Important

Fleet Plans can be used only with the V3 configuration.


Execute Fleet Plan

To execute a Fleet Plan, use the following command:

./rctl execute fleetplan <fleetplan_name> --v3

Example

./rctl execute fleetplan rctl-june6th-2ops --v3
{"metadata":{"name":"2023-06-07-12-46-19","createdAt":"2023-06-07T12:46:19Z","modifiedAt":"2023-06-07T12:46:19Z","ID":"dk6vw21"},"fleet_plan_id":"dkgyrkx","workflow_id":"01H2AY9JC8WYPNZJCXHCYJW5M2","state":"pending"}
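
When the execute command is called from a script, the job name and initial state can be read from the JSON it prints. The Python sketch below assumes the output has the shape shown in the example above. The function name is hypothetical, and the code tolerates a trailing % character, which zsh prints when output does not end with a newline.

```python
import json


def parse_execute_output(raw):
    """Extract the job identifiers and state from `rctl execute fleetplan` output."""
    text = raw.strip()
    if text.endswith("%"):
        text = text[:-1]  # drop the shell's end-of-line marker, if present
    job = json.loads(text)
    return {
        "job_name": job["metadata"]["name"],
        "job_id": job["metadata"]["ID"],
        "workflow_id": job["workflow_id"],
        "state": job["state"],
    }


sample = (
    '{"metadata":{"name":"2023-06-07-12-46-19","createdAt":"2023-06-07T12:46:19Z",'
    '"modifiedAt":"2023-06-07T12:46:19Z","ID":"dk6vw21"},"fleet_plan_id":"dkgyrkx",'
    '"workflow_id":"01H2AY9JC8WYPNZJCXHCYJW5M2","state":"pending"}%'
)
info = parse_execute_output(sample)
print(info["job_name"], info["state"])  # prints: 2023-06-07-12-46-19 pending
```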

Get Fleet Plan

Use this command to retrieve detailed information about a specific Fleet Plan:

./rctl get fp <fleetplan_name> --v3

Below is an illustrative example that retrieves the "upgrade202023" Fleet Plan:

./rctl get fp upgrade202023 --v3
+----------------+-------------------------------------+-----------+
| FLEETPLAN NAME | FLEET LABELS                        | STATUS    |
+----------------+-------------------------------------+-----------+
| upgrade202023  | {"org":"rafay","team":"cloud-engg"} | completed |
+----------------+-------------------------------------+-----------+

Use the commands below to get more information about the Fleet Plan in JSON or YAML format:

./rctl get fp <fleetplan_name> --v3 -o json

(or)

./rctl get fp <fleetplan_name> --v3 -o yaml

Example

./rctl get fp upgrade202023 --v3 -o yaml
apiVersion: infra.k8smgmt.io/v3
kind: FleetPlan
metadata:
  name: upgrade202023
  project: fleetproj4
spec:
  fleet:
    kind: clusters
    labels:
      org: rafay
      team: cloud-engg
    projects:
    - name: fleetproj4
    - name: fleet-proj3
    - name: fleetpro
    - name: fleetpro2
  operationWorkflow:
    operations:
    - action:
        description: upgrade ng upgrade
        name: ngk8sgrade
        nodeGroupsUpgradeConfig:
          names:
          - managng1
          - fleet2-selfngroup1
          - ng-30c2bf34
          version: "1.24"
        type: nodeGroupsUpgrade
      name: upgradeks8
status:
  jobStatus:
    lastUpdated: "2023-06-01T05:29:13Z"
    reason: all activities completed
    status: completed
  resourcesStatus:
  - name: fleet02
    operations:
    - action:
        description: upgrade ng upgrade
        lastUpdated: "2023-06-01T05:29:13Z"
        name: ngk8sgrade
        reason: Desired and deployed config are same
        status: success
    project: fleetpro2
  - name: fleet2-eks
    operations:
    - action:
        description: upgrade ng upgrade
        lastUpdated: "2023-06-01T05:29:13Z"
        name: ngk8sgrade
        reason: Desired and deployed config are same
        status: success
    project: fleetpro

Use this command to list all Fleet Plans:

./rctl get fleetplans --v3

An illustrative example is given below:

./rctl get fleetplans --v3
+--------------------------+----------------------------------------------+-------------------------+
| FLEETPLAN NAME           | FLEET LABELS                                 | STATUS                  |
+--------------------------+----------------------------------------------+-------------------------+
| upgrade202023            | {"org":"rafay","team":"cloud-engg"}          | completed               |
+--------------------------+----------------------------------------------+-------------------------+
| fleetplan4-upgrade-k8s   | {"org":"rafay","team":"cloud-engg"}          | completed_with_failures |
+--------------------------+----------------------------------------------+-------------------------+
| fleetplan-docker         | {"role":"qa","user":"demo-rafay"}            | completed               |
+--------------------------+----------------------------------------------+-------------------------+
| kubectlplan              | {"rafay.dev/clusterName":"eksfleet4"}        | completed_with_failures |
+--------------------------+----------------------------------------------+-------------------------+
| may25-2023               | {"org":"rafay","team":"cloud-engg"}          | completed_with_failures |
+--------------------------+----------------------------------------------+-------------------------+
| fleet-ng-upgrades        | {"org":"rafay","team":"cloud-engg"}          | cancelled               |
+--------------------------+----------------------------------------------+-------------------------+

Get Targets of a Fleet Plan Job

Run the following command to get the targets of a Fleet Plan job:

./rctl gettargets fleetplan <fleetplan_name>

Example
./rctl gettargets fleetplan demo-fleetplan
{"items":[{"status":"fail","reason":"timeout waiting for agent to ack","resource":{"name":"kalyan-eks-privatep","project":"kalyanfleetproj4"}},{"status":"fail","reason":"timeout waiting for agent to ack","resource":{"name":"kalyan-fleetimp1","project":"kalyanfleetproj4"}}],"statusCount":{"failCount":2},"metadata":{"count":2,"limit":10}}
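
The single-line JSON above is hard to scan when a fleet has many targets. As a sketch, the following Python snippet counts targets per status and lists the failed ones with their reasons. It assumes the output shape shown in the example, and summarize_targets is a hypothetical helper name.

```python
import json
from collections import Counter


def summarize_targets(raw):
    """Return (status counts, list of failed targets) from gettargets output."""
    text = raw.strip()
    if text.endswith("%"):
        text = text[:-1]  # tolerate a shell end-of-line marker
    items = json.loads(text).get("items", [])
    counts = Counter(item["status"] for item in items)
    failures = [
        (item["resource"]["project"], item["resource"]["name"], item.get("reason", ""))
        for item in items
        if item["status"] == "fail"
    ]
    return counts, failures


sample = (
    '{"items":[{"status":"fail","reason":"timeout waiting for agent to ack",'
    '"resource":{"name":"kalyan-eks-privatep","project":"kalyanfleetproj4"}},'
    '{"status":"fail","reason":"timeout waiting for agent to ack",'
    '"resource":{"name":"kalyan-fleetimp1","project":"kalyanfleetproj4"}}],'
    '"statusCount":{"failCount":2},"metadata":{"count":2,"limit":10}}'
)
counts, failures = summarize_targets(sample)
for project, name, reason in failures:
    print(f"{project}/{name}: {reason}")
```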

Get Jobs in a Fleet Plan

Use the following command to get the job details of a Fleet Plan:

./rctl getjobs fleetplan <fleetplan_name>

Example

./rctl getjobs fleetplan demo-fleetplan
{"metadata":{"count":29,"limit":10},"items":[{"metadata":{"name":"2023-06-12-11-11-38","createdAt":"2023-06-12T11:11:38Z","modifiedAt":"2023-06-12T11:18:04Z","ID":"x28odom"},"fleet_plan_id":"gkjx4m0","workflow_id":"01H2QMVSTZPZY8AGSPCXTRJXFG","state":"completed_with_failures","reason":"2 problems:\n\n- activity failed: kalyanfleetproj4-kalyan-eks-privatep-docker-phook: timeout waiting for agent to ack\n- activity failed: kalyanfleetproj4-kalyan-fleetimp1-docker-phook: timeout waiting for agent to ack","resource_count":2},{"metadata":{"name":"2023-06-12-09-29-28","createdAt":"2023-06-12T09:29:28Z","modifiedAt":"2023-06-12T09:37:04Z","ID":"g29wlek"},"fleet_plan_id":"gkjx4m0","workflow_id":"01H2QF0QMHXZGDBVEZWHWSN99A","state":"completed_with_failures","reason":"2 problems:\n\n- activity failed: kalyanfleetproj4-kalyan-eks-privatep-docker-phook: timeout waiting for agent to ack\n- activity failed: kalyanfleetproj4-kalyan-fleetimp1-docker-phook: timeout waiting for agent to ack"},{"metadata":{"name":"2023-06-12-05-52-13","createdAt":"2023-06-12T05:52:13Z","modifiedAt":"2023-06-12T05:59:43Z","ID":"qkogll2"},"fleet_plan_id":"gkjx4m0","workflow_id":"01H2Q2JY0FWSQCG7NSF0NM7W9Q","state":"completed_with_failures","reason":"2 problems:\n\n- activity failed: kalyanfleetproj4-kalyan-eks-privatep-docker-phook: timeout waiting for agent to ack\n- activity failed: kalyanfleetproj4-kalyan-fleetimp1-docker-phook: timeout waiting for agent to ack","resource_count":2},{"metadata":{"name":"2023-06-12-05-35-25","createdAt":"2023-06-12T05:35:25Z","modifiedAt":"2023-06-12T05:36:33Z","ID":"x28ojom"},"fleet_plan_id":"gkjx4m0","workflow_id":"01H2Q1M5N23H1F01P7HPT1WZD0","state":"completed_with_failures","reason":"2 problems:\n\n- activity failed: kalyanfleetproj4-kalyan-fleetimp1-docker-phook: activity failed: kalyanfleetproj4-kalyan-fleetimp1-docker-phook: Error: : Error from server (Forbidden): nodes is forbidden: User \"system:serviceaccount:rafay-system:default\" cannot list resource 
\"nodes\" in API group \"\" at the cluster scope\n\n- activity failed: kalyanfleetproj4-kalyan-eks-privatep-docker-phook: activity failed: kalyanfleetproj4-kalyan-eks-privatep-docker-phook: Error: : Error from server (Forbidden): nodes is forbidden: User \"system:serviceaccount:rafay-system:default\" cannot list resource \"nodes\" in API group \"\" at the cluster scope\n","resource_count":2},{"metadata":{"name":"2023-06-12-05-29-57","createdAt":"2023-06-12T05:29:57Z","modifiedAt":"2023-06-12T05:33:53Z","ID":"lk5xrw2"},"fleet_plan_id":"gkjx4m0","workflow_id":"01H2Q1A5777NJC3N3YCWN0W2WG","state":"completed","reason":"all activities completed","resource_count":2},{"metadata":{"name":"2023-06-08-07-11-42","createdAt":"2023-06-08T07:11:42Z","modifiedAt":"2023-06-08T07:18:27Z","ID":"1ky4e0k"},"fleet_plan_id":"gkjx4m0","workflow_id":"01H2CXHKWR0MPC62H0HC7H2MEM","state":"completed_with_failures","reason":"5 problems:\n\n- activity failed: kalyan-fleet-proj3-kalyanv123-proj3-docker-phook: timeout waiting for agent to ack\n- activity failed: kalyanfleetproj4-kalyan-eks-privatep-docker-phook: timeout waiting for agent to ack\n- activity failed: kalyanfleetproj4-kalyan-fleetimp1-docker-phook: timeout waiting for agent to ack\n- activity failed: kalyan-fleet-proj3-kalyan-import-proj3-docker-phook: timeout waiting for agent to ack\n- activity failed: kalyan-fleetpro-kalyan-proj1-eks1-docker-phook: timeout waiting for agent to ack"},{"metadata":{"name":"2023-06-08-06-53-30","createdAt":"2023-06-08T06:53:30Z","modifiedAt":"2023-06-08T06:59:17Z","ID":"6kno802"},"fleet_plan_id":"gkjx4m0","workflow_id":"01H2CWG9EF131J5TZCPZF8MCJC","state":"completed_with_failures","reason":"5 problems:\n\n- activity failed: kalyan-fleetpro-kalyan-proj1-eks1-docker-phook: timeout waiting for agent to ack\n- activity failed: kalyanfleetproj4-kalyan-eks-privatep-docker-phook: timeout waiting for agent to ack\n- activity failed: kalyan-fleet-proj3-kalyanv123-proj3-docker-phook: timeout waiting for agent 
to ack\n- activity failed: kalyan-fleet-proj3-kalyan-import-proj3-docker-phook: timeout waiting for agent to ack\n- activity failed: kalyanfleetproj4-kalyan-fleetimp1-docker-phook: timeout waiting for agent to ack","resource_count":5},{"metadata":{"name":"2023-06-08-06-11-25","createdAt":"2023-06-08T06:11:25Z","modifiedAt":"2023-06-08T06:13:37Z","ID":"d27xwrk"},"fleet_plan_id":"gkjx4m0","workflow_id":"01H2CT37J1EZBGVYE0R2E5TKPN","state":"completed_with_failures","reason":"activity failed: kalyan-fleet-proj3-kalyan-import-proj3-docker-phook: error initializing k8s manager for activity id 01H2CT3NHN0VJTZSSRR4C1WHE7 name kalyan-fleet-proj3-kalyan-import-proj3-docker-phook: error checking namespace rafay-system for run-4045856d681c20f8-01h2ct3nhn0vjtzssrr4c1whe7: ERROR: failed to forward request to cluster. Please retry","resource_count":5},{"metadata":{"name":"2023-06-07-18-35-11","createdAt":"2023-06-07T18:35:11Z","modifiedAt":"2023-06-07T18:42:17Z","ID":"gkj38xm"},"fleet_plan_id":"gkjx4m0","workflow_id":"01H2BJ8BSPVZ3T7DSTM38G0CPW","state":"completed_with_failures","reason":"5 problems:\n\n- activity failed: kalyanfleetproj4-kalyan-fleetimp1-docker-phook: timeout waiting for agent to ack\n- activity failed: kalyanfleetproj4-kalyan-eks-privatep-docker-phook: timeout waiting for agent to ack\n- activity failed: kalyan-fleetpro-kalyan-proj1-eks1-docker-phook: timeout waiting for agent to ack\n- activity failed: kalyan-fleet-proj3-kalyanv123-proj3-docker-phook: timeout waiting for agent to ack\n- activity failed: kalyan-fleet-proj3-kalyan-import-proj3-docker-phook: timeout waiting for agent to ack","resource_count":5},{"metadata":{"name":"2023-06-07-18-32-14","createdAt":"2023-06-07T18:32:14Z","modifiedAt":"2023-06-07T18:34:37Z","ID":"jkeznqk"},"fleet_plan_id":"gkjx4m0","workflow_id":"01H2BJ2ZVQ6F9PZ722Z9YAYENB","state":"cancelled","reason":"workflow run cancelled","resource_count":5}]}
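
In the example output, metadata.count is 29 while limit is 10, so the response lists only some of the plan's jobs. The Python sketch below summarizes such a response by state and picks the most recent job. It assumes the output shape shown above, uses a shortened sample built from three of the listed jobs, and summarize_jobs is a hypothetical helper name.

```python
import json
from collections import Counter


def summarize_jobs(raw):
    """Summarize getjobs output: totals, jobs per state, and the newest job name."""
    text = raw.strip()
    if text.endswith("%"):
        text = text[:-1]  # tolerate a shell end-of-line marker
    data = json.loads(text)
    jobs = data.get("items", [])
    # createdAt values are UTC ISO-8601 strings, so string order is time order.
    latest = max(jobs, key=lambda job: job["metadata"]["createdAt"]) if jobs else None
    return {
        "total": data["metadata"]["count"],
        "returned": len(jobs),
        "states": dict(Counter(job["state"] for job in jobs)),
        "latest": latest["metadata"]["name"] if latest else None,
    }


# Shortened sample using three jobs from the example output.
sample = json.dumps({
    "metadata": {"count": 29, "limit": 10},
    "items": [
        {"metadata": {"name": "2023-06-12-05-29-57", "createdAt": "2023-06-12T05:29:57Z"},
         "state": "completed"},
        {"metadata": {"name": "2023-06-12-11-11-38", "createdAt": "2023-06-12T11:11:38Z"},
         "state": "completed_with_failures"},
        {"metadata": {"name": "2023-06-07-18-32-14", "createdAt": "2023-06-07T18:32:14Z"},
         "state": "cancelled"},
    ],
})
summary = summarize_jobs(sample)
print(summary["returned"], "of", summary["total"], "jobs; latest:", summary["latest"])
```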

Get Fleet Plan Status

Use the following command to check the status of a Fleet Plan job:

./rctl status fleetplan <fleetplan_name> <job_id>

Example

./rctl status fleetplan demo-fleetplan 3
{"jobStatus":{"status":"completed","reason":"all activities completed","lastUpdated":"2023-10-26T06:29:46Z"},"resourcesStatus":[{"name":"ak-aks2","project":"defaultproject","operations":[{"action":{"name":"cpupg","status":"success","reason":"activity poll again: defaultproject-ak-aks2-cpupg","lastUpdated":"2023-10-26T06:29:46Z","description":"Upgrade cp version"}}]}]}
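
In automation, the status command is usually called repeatedly until the job finishes. The Python sketch below shows one way to do that. It treats completed, completed_with_failures, and cancelled (the values that appear in the listings on this page) as final states, which is an assumption. The helper names, polling interval, and attempt limit are also illustrative. The status source is passed in as a function so that the loop can be exercised without a live controller.

```python
import json
import subprocess
import time

# Assumption: these states, seen in the example listings, mean the job is finished.
TERMINAL_STATES = {"completed", "completed_with_failures", "cancelled"}


def rctl_status(fleetplan_name, job_id, rctl="./rctl"):
    """Run the documented status command and return its raw output."""
    result = subprocess.run(
        [rctl, "status", "fleetplan", fleetplan_name, str(job_id)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def wait_for_job(fetch_status, interval_seconds=30, max_attempts=40, sleep=time.sleep):
    """Poll fetch_status() until jobStatus.status is final, then return that status."""
    for _ in range(max_attempts):
        raw = fetch_status().strip()
        if raw.endswith("%"):
            raw = raw[:-1]  # tolerate a shell end-of-line marker
        state = json.loads(raw)["jobStatus"]["status"]
        if state in TERMINAL_STATES:
            return state
        sleep(interval_seconds)
    raise TimeoutError(f"job not finished after {max_attempts} status checks")


# Usage against a real plan (not executed here):
# final = wait_for_job(lambda: rctl_status("demo-fleetplan", 3))
```

Passing fetch_status and sleep as parameters keeps the polling logic separate from the CLI call, so a canned sequence of outputs can stand in for rctl during testing.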

Delete Fleet Plan

Run the following command to delete a Fleet Plan:

./rctl delete fp <fleetplan_name> --v3

A comprehensive set of APIs is available that enables users to create and execute Fleet Plans, retrieve Fleet Plan details, and delete Fleet Plans.

Access the OpenAPI Explorer to view and leverage the Fleet Plan v3 APIs. Provide the project and name of the resource wherever required and execute.