Automation
For automation, it is strongly recommended that users create and manage Fleet Plans to carry out operations across clusters. Fleet Plans are well suited to scenarios where multiple clusters require the same set of operations.
A Fleet Plan can be created and managed via two automation methods:
- RCTL CLI
- Swagger API
Fleet Plan Automation Lifecycle
Create and Execute Fleet Plan
Use the following command to create and run the Fleet Plan:
./rctl apply -f <fleetplan_filename.yaml>
Below is an example YAML file that creates a Fleet Plan named demo_fleetplan with the configuration details provided in Step 1:
kind: FleetPlan
apiVersion: infra.k8smgmt.io/v3
metadata:
  name: demo_fleetplan
  project: fleetpro2
spec:
  fleet:
    kind: clusters
    labels:
      role: qa
      user: demo_user
    projects:
      - name: fleetpro2
      - name: fleetproj4
      - name: fleetpro
      - name: proj3
  operationWorkflow:
    operations:
      - name: operation-fleetclusters
        prehooks:
          - description: precheck for operation1
            inject:
              - KUBECONFIG
            name: executekubent
            containerConfig:
              runner: agent
              image: ghcr.io/doitintl/kube-no-trouble:latest
              arguments:
                - '-o'
                - json
                - '-e'
              commands:
                - /app/kubent
            timeoutSeconds: 120
            successCondition: 'if #status.container.exitCode == 0 { success: false }'
        action:
          type: patch
          description: set max nodes count
          name: setmaxnodecount
          patchConfig:
            - op: replace
              path: .spec.config.managedNodeGroups[0].maxSize
              value: 33
          continueOnFailure: true
        posthooks:
          - description: post check after operation
            inject:
              - KUBECONFIG
            name: executeposthook
            containerConfig:
              runner: agent
              image: ghcr.io/doitintl/kube-no-trouble:latest
              arguments:
                - '-o'
                - json
                - '-e'
              commands:
                - /app/kubent
            timeoutSeconds: 120
            successCondition: 'if #status.container.exitCode != 0 { failed: true }'
  agents:
    - name: gitops-agent1
    - name: gitops-agent2
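To make the effect of the patch action in the example above more concrete, the following Python sketch applies a replace operation to a dictionary standing in for a cluster spec. Only the op, path, and value come from the example; the path parser, the apply_replace helper, and the sample cluster spec are illustrative assumptions and do not represent how the platform applies patches internally.

```python
import re

def parse_path(path):
    """Split '.a.b[0].c' into ['a', 'b', 0, 'c'] (keys as str, indices as int)."""
    tokens = []
    for key, index in re.findall(r"\.?([^.\[\]]+)|\[(\d+)\]", path):
        tokens.append(int(index) if index else key)
    return tokens

def apply_replace(document, path, value):
    """Apply a single 'replace' operation in place and return the document."""
    *parents, last = parse_path(path)
    target = document
    for token in parents:
        target = target[token]  # KeyError/IndexError if the path does not exist
    if isinstance(target, dict) and last not in target:
        raise KeyError(f"cannot replace missing field: {path}")
    target[last] = value
    return document

# Hypothetical cluster spec, reduced to the field touched by the example.
cluster = {"spec": {"config": {"managedNodeGroups": [{"name": "ng-1", "maxSize": 4}]}}}

# patchConfig entry copied from the Fleet Plan example.
patch_config = [
    {"op": "replace", "path": ".spec.config.managedNodeGroups[0].maxSize", "value": 33},
]
for operation in patch_config:
    if operation["op"] == "replace":
        apply_replace(cluster, operation["path"], operation["value"])

print(cluster["spec"]["config"]["managedNodeGroups"][0]["maxSize"])  # 33
```

In this sketch a replace on a missing field raises an error instead of creating the field, which is one reasonable reading of "replace"; the platform's actual behavior may differ.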
Another example provided below illustrates the creation of a Fleet Plan with a GPU-based posthook and actions to update the blueprint.
kind: FleetPlan
apiVersion: infra.k8smgmt.io/v3
metadata:
  name: fleet-eks-update-bp-version
  project: demofleet
spec:
  fleet:
    kind: clusters
    labels:
      cluster-type: eks
      env: eks-demo
    projects:
      - name: demofleet
      - name: platform
  operationWorkflow:
    operations:
      - name: update-bp-version
        prehooks:
          - description: list-all-pods
            inject:
              - KUBECONFIG
            name: list-all-pods
            containerConfig:
              runner: cluster
              image: alpine/k8s:1.24.16
              arguments:
                - -c
                - kubectl get pod -A -o wide
              commands:
                - /bin/sh
            timeoutSeconds: 120
            successCondition: 'if #status.container.exitCode == 0 { success: false }'
        action:
          type: patch
          description: update-bp-version
          name: update-bp-version
          patchConfig:
            - op: replace
              path: .spec.blueprintConfig.name
              value: eks-standard-bp
            - op: replace
              path: .spec.blueprintConfig.version
              value: v1.1
          continueOnFailure: true
        posthooks:
          - description: re-check-list-all-pods
            inject:
              - KUBECONFIG
            name: re-check-list-all-pods
            containerConfig:
              runner: cluster
              image: alpine/k8s:1.24.16
              arguments:
                - -c
                - kubectl get pod -A -o wide
              commands:
                - /bin/sh
          - description: gpu benchmarks
            inject:
              - KUBECONFIG
            name: gpu-benchmark
            containerConfig:
              runner: cluster
              image: cemizm/tf-benchmark-gpu
            timeoutSeconds: 120
            successCondition: 'if #status.container.exitCode != 0 { failed: true }'
  agents:
    - name: demofleet-gitops-agents
The example below illustrates the creation of a Fleet Plan with an HTTP config type prehook.
kind: FleetPlan
apiVersion: infra.k8smgmt.io/v3
metadata:
  name: demo-fp
  project: defaultproject
spec:
  fleet:
    kind: clusters
    labels:
      key: value1
    projects:
      - name: defaultproject
  operationWorkflow:
    operations:
      - name: op1
        prehooks:
          - description: Adding HTTP API
            name: prehook2
            httpConfig:
              method: GET
              endpoint: 'https://google.com'
              headers:
                agent: chrome
            timeoutSeconds: 10
            successCondition: |
              if #status.container.exitCode == 0 {
                success: true
              }
              if #status.container.exitCode != 0 {
                failed: true
              }
        action:
          type: patch
          name: action1
          description: updating nodeGroups
          patchConfig:
            - op: replace
              path: '.spec.config.nodeGroups[0].desiredCapacity'
              value: 2
          continueOnFailure: true
Here is an example of a fleet plan with an HTTP prehook that makes an API call to retrieve the edge response. The successCondition is configured so that if the cluster's health status is reported as healthy, the prehook is marked successful, which allows the cluster upgrade action to proceed.
kind: FleetPlan
apiVersion: infra.k8smgmt.io/v3
metadata:
  name: demo-fleetplan
  project: demo-project1
spec:
  fleet:
    kind: clusters
    labels:
      env: stage
      team: cloud
    projects:
      - name: dev-project
      - name: qa-project
  operationWorkflow:
    operations:
      - name: fleet-operation1
        prehooks:
          - inject:
              - KUBECONFIG
            name: http-prehook1
            httpConfig:
              endpoint: >-
                http://qc-console.stage.rafay.dev/edge/v1/projects/pk0nl7m/edges/249e4nk/
              method: GET
              headers:
                Cookie: >-
                  csrftoken=zbZhWc1PlLB8URiYCmM5uXO3pJm98OLH3ETLJJEEvrtRMn3FvWELyThDgxOr6xp8;
                  rsid=yph7k8w5z26k1or7jyhg3otjoihwixis
            timeoutSeconds: 300
            successCondition: "if #status.http.body.health == 1 {\n\tsuccess: true\n}\nif #status.http.body.health != 1 {\n\tfailed: true\n}"
        action:
          type: controlPlaneUpgrade
          name: cp-upgrade
          controlPlaneUpgradeConfig:
            version: '1.28'
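The decision encoded by the successCondition in the example above can be sketched in Python as follows. This is only an illustration of the two branches: the function name, the dictionary shape mirroring #status.http.body, and the sample response bodies are assumptions, and the platform evaluates the real expression itself.

```python
import json

def evaluate_success_condition(status):
    """Return the hook outcome for a status shaped like #status in the example."""
    health = status["http"]["body"].get("health")
    if health == 1:
        return {"success": True}
    # Mirrors the second branch: any other health value fails the hook.
    # Treating a missing "health" field as a failure is an assumption of this sketch.
    return {"failed": True}

# Hypothetical response bodies; only the "health" field comes from the example.
healthy = {"http": {"body": json.loads('{"health": 1}')}}
unhealthy = {"http": {"body": json.loads('{"health": 0}')}}

print(evaluate_success_condition(healthy))    # {'success': True}
print(evaluate_success_condition(unhealthy))  # {'failed': True}
```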
Important
Users can leverage Fleet Plans only with the V3 configuration.
Execute Fleet Plan
To execute a fleet plan, use the command below:
./rctl execute fleetplan <fleetplan_name> --v3
Example
./rctl execute fleetplan rctl-june6th-2ops --v3
{"metadata":{"name":"2023-06-07-12-46-19","createdAt":"2023-06-07T12:46:19Z","modifiedAt":"2023-06-07T12:46:19Z","ID":"dk6vw21"},"fleet_plan_id":"dkgyrkx","workflow_id":"01H2AY9JC8WYPNZJCXHCYJW5M2","state":"pending"}
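When the execute command is called from a script, the JSON response shown above can be parsed to record which job was started. The sketch below uses only fields visible in that sample response; the summarize_job helper and the keys of its result are hypothetical names chosen for illustration.

```python
import json

# Response printed by the execute command in the example above.
raw_response = (
    '{"metadata":{"name":"2023-06-07-12-46-19","createdAt":"2023-06-07T12:46:19Z",'
    '"modifiedAt":"2023-06-07T12:46:19Z","ID":"dk6vw21"},"fleet_plan_id":"dkgyrkx",'
    '"workflow_id":"01H2AY9JC8WYPNZJCXHCYJW5M2","state":"pending"}'
)

def summarize_job(raw):
    """Extract the identifying fields and the state from an execute response."""
    job = json.loads(raw.strip())
    metadata = job["metadata"]
    return {
        "name": metadata["name"],
        "id": metadata["ID"],
        "fleet_plan_id": job["fleet_plan_id"],
        "workflow_id": job["workflow_id"],
        "state": job["state"],
    }

summary = summarize_job(raw_response)
print(summary["name"], summary["state"])  # 2023-06-07-12-46-19 pending
```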
Get Fleet Plan
Use this command to retrieve detailed information about a specific fleet plan:
./rctl get fp <fleetplan_name> --v3
Below is an illustrative example for the "upgrade202023" fleet plan:
./rctl get fp upgrade202023 --v3
+----------------+-------------------------------------+-----------+
| FLEETPLAN NAME | FLEET LABELS                        | STATUS    |
+----------------+-------------------------------------+-----------+
| upgrade202023  | {"org":"rafay","team":"cloud-engg"} | completed |
+----------------+-------------------------------------+-----------+
Use the commands below to get more detailed fleet plan information in JSON or YAML format:
./rctl get fp <fleetplan_name> --v3 -o json
(or)
./rctl get fp <fleetplan_name> --v3 -o yaml
Example
./rctl get fp upgrade202023 --v3 -o yaml
apiVersion: infra.k8smgmt.io/v3
kind: FleetPlan
metadata:
  name: upgrade202023
  project: fleetproj4
spec:
  fleet:
    kind: clusters
    labels:
      org: rafay
      team: cloud-engg
    projects:
      - name: fleetproj4
      - name: fleet-proj3
      - name: fleetpro
      - name: fleetpro2
  operationWorkflow:
    operations:
      - action:
          description: upgrade ng upgrade
          name: ngk8sgrade
          nodeGroupsUpgradeConfig:
            names:
              - managng1
              - fleet2-selfngroup1
              - ng-30c2bf34
            version: "1.24"
          type: nodeGroupsUpgrade
        name: upgradeks8
status:
  jobStatus:
    lastUpdated: "2023-06-01T05:29:13Z"
    reason: all activities completed
    status: completed
  resourcesStatus:
    - name: fleet02
      operations:
        - action:
            description: upgrade ng upgrade
            lastUpdated: "2023-06-01T05:29:13Z"
            name: ngk8sgrade
            reason: Desired and deployed config are same
            status: success
      project: fleetpro2
    - name: fleet2-eks
      operations:
        - action:
            description: upgrade ng upgrade
            lastUpdated: "2023-06-01T05:29:13Z"
            name: ngk8sgrade
            reason: Desired and deployed config are same
            status: success
      project: fleetpro
Use this command to list all fleet plans:
./rctl get fleetplans --v3
An illustrative example is given below:
./rctl get fleetplans --v3
+--------------------------+----------------------------------------------+-------------------------+
| FLEETPLAN NAME           | FLEET LABELS                                 | STATUS                  |
+--------------------------+----------------------------------------------+-------------------------+
| upgrade202023            | {"org":"rafay","team":"cloud-engg"}          | completed               |
+--------------------------+----------------------------------------------+-------------------------+
| fleetplan4-upgrade-k8s   | {"org":"rafay","team":"cloud-engg"}          | completed_with_failures |
+--------------------------+----------------------------------------------+-------------------------+
| fleetplan-docker         | {"role":"qa","user":"demo-rafay"}            | completed               |
+--------------------------+----------------------------------------------+-------------------------+
| kubectlplan              | {"rafay.dev/clusterName":"eksfleet4"}        | completed_with_failures |
+--------------------------+----------------------------------------------+-------------------------+
| may25-2023               | {"org":"rafay","team":"cloud-engg"}          | completed_with_failures |
+--------------------------+----------------------------------------------+-------------------------+
| fleet-ng-upgrades        | {"org":"rafay","team":"cloud-engg"}          | cancelled               |
+--------------------------+----------------------------------------------+-------------------------+
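If the table output above needs to be consumed by a script, it can be split into records with a few lines of Python. The sketch below is a generic parser for this kind of bordered table; the parse_table helper is hypothetical, and it assumes that cell values never contain the "|" character, which holds for the sample rows shown.

```python
import json

def parse_table(text):
    """Turn a '+---+' bordered table into a list of dicts keyed by the header row."""
    header, rows = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip border lines and blanks
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells
        else:
            rows.append(dict(zip(header, cells)))
    return rows

# A few rows copied from the listing above.
sample = """
+----------------+--------------+--------+
| FLEETPLAN NAME | FLEET LABELS | STATUS |
+----------------+--------------+--------+
| upgrade202023 | {"org":"rafay","team":"cloud-engg"} | completed |
| kubectlplan | {"rafay.dev/clusterName":"eksfleet4"} | completed_with_failures |
| fleet-ng-upgrades | {"org":"rafay","team":"cloud-engg"} | cancelled |
+----------------+--------------+--------+
"""

plans = parse_table(sample)
needs_attention = [p["FLEETPLAN NAME"] for p in plans if p["STATUS"] != "completed"]
labels = json.loads(plans[0]["FLEET LABELS"])  # the labels column is valid JSON

print(needs_attention)  # ['kubectlplan', 'fleet-ng-upgrades']
print(labels["team"])   # cloud-engg
```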
Get Targets of a Fleet Plan Job
Run the command below to get the targets of a fleet plan job:
./rctl gettargets fleetplan <fleetplan_name>
Example
./rctl gettargets fleetplan demo-fleetplan
{"items":[{"status":"fail","reason":"timeout waiting for agent to ack","resource":{"name":"kalyan-eks-privatep","project":"kalyanfleetproj4"}},{"status":"fail","reason":"timeout waiting for agent to ack","resource":{"name":"kalyan-fleetimp1","project":"kalyanfleetproj4"}}],"statusCount":{"failCount":2},"metadata":{"count":2,"limit":10}}
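The targets response above lends itself to a quick summary of which clusters failed and why. The following Python sketch groups failed targets by reason using only the fields present in the sample output; the failed_targets_by_reason helper is a hypothetical name.

```python
import json
from collections import defaultdict

# Output of the gettargets example above.
raw_targets = (
    '{"items":[{"status":"fail","reason":"timeout waiting for agent to ack",'
    '"resource":{"name":"kalyan-eks-privatep","project":"kalyanfleetproj4"}},'
    '{"status":"fail","reason":"timeout waiting for agent to ack",'
    '"resource":{"name":"kalyan-fleetimp1","project":"kalyanfleetproj4"}}],'
    '"statusCount":{"failCount":2},"metadata":{"count":2,"limit":10}}'
)

def failed_targets_by_reason(raw):
    """Group 'project/cluster' names of failed targets by their failure reason."""
    payload = json.loads(raw)
    grouped = defaultdict(list)
    for item in payload.get("items", []):
        if item.get("status") == "fail":
            resource = item["resource"]
            target = f'{resource["project"]}/{resource["name"]}'
            grouped[item.get("reason", "unknown")].append(target)
    return dict(grouped)

failures = failed_targets_by_reason(raw_targets)
for reason, targets in failures.items():
    print(f"{reason}: {', '.join(targets)}")
```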
Get Jobs in a Fleet Plan
Use the command below to get the fleet plan job details:
./rctl getjobs fleetplan <fleetplan_name>
Example
./rctl getjobs fleetplan demo-fleetplan
{"metadata":{"count":29,"limit":10},"items":[{"metadata":{"name":"2023-06-12-11-11-38","createdAt":"2023-06-12T11:11:38Z","modifiedAt":"2023-06-12T11:18:04Z","ID":"x28odom"},"fleet_plan_id":"gkjx4m0","workflow_id":"01H2QMVSTZPZY8AGSPCXTRJXFG","state":"completed_with_failures","reason":"2 problems:\n\n- activity failed: kalyanfleetproj4-kalyan-eks-privatep-docker-phook: timeout waiting for agent to ack\n- activity failed: kalyanfleetproj4-kalyan-fleetimp1-docker-phook: timeout waiting for agent to ack","resource_count":2},{"metadata":{"name":"2023-06-12-09-29-28","createdAt":"2023-06-12T09:29:28Z","modifiedAt":"2023-06-12T09:37:04Z","ID":"g29wlek"},"fleet_plan_id":"gkjx4m0","workflow_id":"01H2QF0QMHXZGDBVEZWHWSN99A","state":"completed_with_failures","reason":"2 problems:\n\n- activity failed: kalyanfleetproj4-kalyan-eks-privatep-docker-phook: timeout waiting for agent to ack\n- activity failed: kalyanfleetproj4-kalyan-fleetimp1-docker-phook: timeout waiting for agent to ack"},{"metadata":{"name":"2023-06-12-05-52-13","createdAt":"2023-06-12T05:52:13Z","modifiedAt":"2023-06-12T05:59:43Z","ID":"qkogll2"},"fleet_plan_id":"gkjx4m0","workflow_id":"01H2Q2JY0FWSQCG7NSF0NM7W9Q","state":"completed_with_failures","reason":"2 problems:\n\n- activity failed: kalyanfleetproj4-kalyan-eks-privatep-docker-phook: timeout waiting for agent to ack\n- activity failed: kalyanfleetproj4-kalyan-fleetimp1-docker-phook: timeout waiting for agent to ack","resource_count":2},{"metadata":{"name":"2023-06-12-05-35-25","createdAt":"2023-06-12T05:35:25Z","modifiedAt":"2023-06-12T05:36:33Z","ID":"x28ojom"},"fleet_plan_id":"gkjx4m0","workflow_id":"01H2Q1M5N23H1F01P7HPT1WZD0","state":"completed_with_failures","reason":"2 problems:\n\n- activity failed: kalyanfleetproj4-kalyan-fleetimp1-docker-phook: activity failed: kalyanfleetproj4-kalyan-fleetimp1-docker-phook: Error: : Error from server (Forbidden): nodes is forbidden: User \"system:serviceaccount:rafay-system:default\" cannot list resource 
\"nodes\" in API group \"\" at the cluster scope\n\n- activity failed: kalyanfleetproj4-kalyan-eks-privatep-docker-phook: activity failed: kalyanfleetproj4-kalyan-eks-privatep-docker-phook: Error: : Error from server (Forbidden): nodes is forbidden: User \"system:serviceaccount:rafay-system:default\" cannot list resource \"nodes\" in API group \"\" at the cluster scope\n","resource_count":2},{"metadata":{"name":"2023-06-12-05-29-57","createdAt":"2023-06-12T05:29:57Z","modifiedAt":"2023-06-12T05:33:53Z","ID":"lk5xrw2"},"fleet_plan_id":"gkjx4m0","workflow_id":"01H2Q1A5777NJC3N3YCWN0W2WG","state":"completed","reason":"all activities completed","resource_count":2},{"metadata":{"name":"2023-06-08-07-11-42","createdAt":"2023-06-08T07:11:42Z","modifiedAt":"2023-06-08T07:18:27Z","ID":"1ky4e0k"},"fleet_plan_id":"gkjx4m0","workflow_id":"01H2CXHKWR0MPC62H0HC7H2MEM","state":"completed_with_failures","reason":"5 problems:\n\n- activity failed: kalyan-fleet-proj3-kalyanv123-proj3-docker-phook: timeout waiting for agent to ack\n- activity failed: kalyanfleetproj4-kalyan-eks-privatep-docker-phook: timeout waiting for agent to ack\n- activity failed: kalyanfleetproj4-kalyan-fleetimp1-docker-phook: timeout waiting for agent to ack\n- activity failed: kalyan-fleet-proj3-kalyan-import-proj3-docker-phook: timeout waiting for agent to ack\n- activity failed: kalyan-fleetpro-kalyan-proj1-eks1-docker-phook: timeout waiting for agent to ack"},{"metadata":{"name":"2023-06-08-06-53-30","createdAt":"2023-06-08T06:53:30Z","modifiedAt":"2023-06-08T06:59:17Z","ID":"6kno802"},"fleet_plan_id":"gkjx4m0","workflow_id":"01H2CWG9EF131J5TZCPZF8MCJC","state":"completed_with_failures","reason":"5 problems:\n\n- activity failed: kalyan-fleetpro-kalyan-proj1-eks1-docker-phook: timeout waiting for agent to ack\n- activity failed: kalyanfleetproj4-kalyan-eks-privatep-docker-phook: timeout waiting for agent to ack\n- activity failed: kalyan-fleet-proj3-kalyanv123-proj3-docker-phook: timeout waiting for agent 
to ack\n- activity failed: kalyan-fleet-proj3-kalyan-import-proj3-docker-phook: timeout waiting for agent to ack\n- activity failed: kalyanfleetproj4-kalyan-fleetimp1-docker-phook: timeout waiting for agent to ack","resource_count":5},{"metadata":{"name":"2023-06-08-06-11-25","createdAt":"2023-06-08T06:11:25Z","modifiedAt":"2023-06-08T06:13:37Z","ID":"d27xwrk"},"fleet_plan_id":"gkjx4m0","workflow_id":"01H2CT37J1EZBGVYE0R2E5TKPN","state":"completed_with_failures","reason":"activity failed: kalyan-fleet-proj3-kalyan-import-proj3-docker-phook: error initializing k8s manager for activity id 01H2CT3NHN0VJTZSSRR4C1WHE7 name kalyan-fleet-proj3-kalyan-import-proj3-docker-phook: error checking namespace rafay-system for run-4045856d681c20f8-01h2ct3nhn0vjtzssrr4c1whe7: ERROR: failed to forward request to cluster. Please retry","resource_count":5},{"metadata":{"name":"2023-06-07-18-35-11","createdAt":"2023-06-07T18:35:11Z","modifiedAt":"2023-06-07T18:42:17Z","ID":"gkj38xm"},"fleet_plan_id":"gkjx4m0","workflow_id":"01H2BJ8BSPVZ3T7DSTM38G0CPW","state":"completed_with_failures","reason":"5 problems:\n\n- activity failed: kalyanfleetproj4-kalyan-fleetimp1-docker-phook: timeout waiting for agent to ack\n- activity failed: kalyanfleetproj4-kalyan-eks-privatep-docker-phook: timeout waiting for agent to ack\n- activity failed: kalyan-fleetpro-kalyan-proj1-eks1-docker-phook: timeout waiting for agent to ack\n- activity failed: kalyan-fleet-proj3-kalyanv123-proj3-docker-phook: timeout waiting for agent to ack\n- activity failed: kalyan-fleet-proj3-kalyan-import-proj3-docker-phook: timeout waiting for agent to ack","resource_count":5},{"metadata":{"name":"2023-06-07-18-32-14","createdAt":"2023-06-07T18:32:14Z","modifiedAt":"2023-06-07T18:34:37Z","ID":"jkeznqk"},"fleet_plan_id":"gkjx4m0","workflow_id":"01H2BJ2ZVQ6F9PZ722Z9YAYENB","state":"cancelled","reason":"workflow run cancelled","resource_count":5}]}
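Because the getjobs output above is a single long JSON document, a short script can make it easier to read. The Python sketch below counts jobs by state, using a trimmed copy of the sample output that keeps only the fields it reads. Reading metadata.count as the total number of jobs and metadata.limit as the page size is an assumption based on the sample, where 10 items are returned for a count of 29.

```python
import json
from collections import Counter

# Trimmed copy of the getjobs output: three of the ten jobs, with only the
# fields used below.
raw_jobs = """
{"metadata": {"count": 29, "limit": 10},
 "items": [
  {"metadata": {"name": "2023-06-12-11-11-38", "ID": "x28odom"},
   "state": "completed_with_failures", "resource_count": 2},
  {"metadata": {"name": "2023-06-12-05-29-57", "ID": "lk5xrw2"},
   "state": "completed", "reason": "all activities completed", "resource_count": 2},
  {"metadata": {"name": "2023-06-07-18-32-14", "ID": "jkeznqk"},
   "state": "cancelled", "reason": "workflow run cancelled", "resource_count": 5}
 ]}
"""

def jobs_by_state(raw):
    """Return (state counts, number of jobs listed, total reported by metadata)."""
    payload = json.loads(raw)
    items = payload.get("items", [])
    states = Counter(job["state"] for job in items)
    return states, len(items), payload["metadata"]["count"]

states, listed, total = jobs_by_state(raw_jobs)
print(dict(states))
print(f"{listed} of {total} jobs listed")
```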
Get Fleet Plan Status
Use the command below to check the status of a fleet plan job:
./rctl status fleetplan <fleetplan_name> <job_id>
Example
./rctl status fleetplan demo-fleetplan 3
{"jobStatus":{"status":"completed","reason":"all activities completed","lastUpdated":"2023-10-26T06:29:46Z"},"resourcesStatus":[{"name":"ak-aks2","project":"defaultproject","operations":[{"action":{"name":"cpupg","status":"success","reason":"activity poll again: defaultproject-ak-aks2-cpupg","lastUpdated":"2023-10-26T06:29:46Z","description":"Upgrade cp version"}}]}]}
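A script that waits on a fleet plan job will typically want the overall job status plus any per-cluster actions that did not succeed. The Python sketch below extracts both from the status output shown above; the unsuccessful_actions helper is a hypothetical name, and treating every action status other than "success" as a problem is an assumption of this sketch.

```python
import json

# Output of the status example above.
raw_status = (
    '{"jobStatus":{"status":"completed","reason":"all activities completed",'
    '"lastUpdated":"2023-10-26T06:29:46Z"},"resourcesStatus":[{"name":"ak-aks2",'
    '"project":"defaultproject","operations":[{"action":{"name":"cpupg",'
    '"status":"success","reason":"activity poll again: defaultproject-ak-aks2-cpupg",'
    '"lastUpdated":"2023-10-26T06:29:46Z","description":"Upgrade cp version"}}]}]}'
)

def unsuccessful_actions(raw):
    """Return the job status and a list of (project, cluster, action, reason)."""
    status = json.loads(raw)
    problems = []
    for resource in status.get("resourcesStatus", []):
        for operation in resource.get("operations", []):
            action = operation.get("action", {})
            if action.get("status") != "success":
                problems.append(
                    (resource["project"], resource["name"],
                     action.get("name"), action.get("reason"))
                )
    return status["jobStatus"]["status"], problems

job_status, problems = unsuccessful_actions(raw_status)
print(job_status, problems)  # completed []
```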
Delete Fleet Plan
Run the command below to delete a fleet plan:
./rctl delete fp <fleetplan_name> --v3
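The lifecycle commands above can be driven from a script. The Python sketch below only assembles the argument lists for the commands documented in this section and provides a thin subprocess wrapper; the lifecycle_commands and run helpers, the RCTL path, and the sample file name are hypothetical, and no flags beyond those shown above are used.

```python
import subprocess

RCTL = "./rctl"  # path to the rctl binary, as used throughout this page

def lifecycle_commands(plan_file, plan_name):
    """Build argument lists for the fleet plan commands documented above."""
    return {
        "apply": [RCTL, "apply", "-f", plan_file],
        "execute": [RCTL, "execute", "fleetplan", plan_name, "--v3"],
        "get": [RCTL, "get", "fp", plan_name, "--v3", "-o", "json"],
        "delete": [RCTL, "delete", "fp", plan_name, "--v3"],
    }

def run(command):
    """Run one command and return its standard output; raises on a non-zero exit."""
    result = subprocess.run(command, check=True, capture_output=True, text=True)
    return result.stdout

commands = lifecycle_commands("fleetplan.yaml", "demo-fleetplan")
for step in ("apply", "execute", "get", "delete"):
    # Printed rather than executed here; call run(commands[step]) to execute.
    print(" ".join(commands[step]))
```

Passing the arguments as a list avoids shell quoting issues when plan names or file paths contain unusual characters.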
Fleet Plans can also be managed through the Swagger API. A comprehensive set of APIs is available that enables users to create and execute fleet plans, retrieve fleet plan details, and delete fleet plans.
Access the OpenAPI Explorer to view and leverage the Fleet Plan v3 APIs. Provide the project and name of the resource wherever required and execute.