GitOps

GitOps System Sync synchronizes configuration bidirectionally between the system (Rafay Controller) and Git repositories. Configuration changes made in the Git repository are reflected in the system, and changes made in the system are reflected in the Git repository.

This framework enables a GitOps-first approach to orchestrating operations. External triggers (Pipeline Triggers) handle modifications made in Git, and internal triggers fire whenever the artifact manifests in the Git repository need to be updated. Because every resource follows a standardized spec, configurations can be managed the same way on both sides.


The following example specs cover the Environment Manager resources used in system sync operations:

Drivers

Container Drivers

Below is an example YAML configuration file for a Container Driver created in the project demoproject. The driver is shared and outOfCluster is set to true.

apiVersion: eaas.envmgmt.io/v1
kind: Driver
metadata:
  description: This is a container driver
    with kube config options
  name: sp-container1
  project: demoproject
spec:
  config:
    container:
      arguments:
      - -refresh=true
      - --log-level=2
      commands:
      - /bin/sh
      - -c
      cpuLimitMilli: "512"
      envVars:
        TF_CLI_CONFIG_FILE: config.json
        TF_LOG: DEBUG
      files:
        config.json: aGk=
        token: aGk=
      image: docker.io/user1569/security:1.1
      imagePullCredentials:
        password: dummypassword
        registry: docker.io
        username: user1569
      kubeConfigOptions:
        kubeConfig: kubeconfig
        outOfCluster: true
      kubeOptions:
        labels:
          env: qc
          release: stable
        namespace: sp-ns
        nodeSelector:
          kubernetes.io/arch: amd64
          topology.kubernetes.io/zone: us-west-2b
        securityContext:
          privileged: true
          readOnlyRootFileSystem: false
        serviceAccountName: sp
        tolerations:
        - effect: PreferNoSchedule
          key: node1
          operator: Equal
          value: value1
        - effect: NoSchedule
          key: node2
          operator: Exists
        - effect: NoExecute
          key: node3
          operator: Equal
          tolerationSeconds: 300
          value: value3
      memoryLimitMb: "1024"
      volumes:
      - mountPath: /tmp/.test
      - enableBackupAndRestore: true
        mountPath: /tmp/.test1
      - mountPath: /tmp/.test2
        pvcSizeGB: "2"
        pvcStorageClass: gp2
        usePVC: true
      workingDirPath: /security/
    successCondition: |-
      if #status.http.statusCode == 200 {
        success: true
      }
      if #status.http.statusCode != 200 {
        failed: true
        reason: "url not reachable"
      }
    timeoutSeconds: 3600
    type: container
  sharing:
    enabled: true
    projects:
    - name: '*'

Note: If out-of-cluster execution is not required, set outOfCluster: false or omit the field from the config spec.


HTTP Drivers

apiVersion: eaas.envmgmt.io/v1
kind: Driver
metadata:
  description: This is an HTTP driver
  name: sp-http
  project: demoproject
spec:
  config:
    http:
      body: |-
        <body>
        <h1>This is a heading</h1>
        <p>This is a paragraph.</p>
        </body>
      endpoint: https://httpbin.org
      headers:
        Content-type: application/json
        X-TOKEN: 1234
      method: GET
    maxRetryCount: 2
    successCondition: |-
      if #status.http.statusCode == 200 {
        success: true
      }
      if #status.http.statusCode != 200 {
        failed: true
        reason: "url not reachable"
      }
    timeoutSeconds: 3600
    type: http
  sharing:
    enabled: true
    projects:
    - name: project1
    - name: project2

Context

apiVersion: eaas.envmgmt.io/v1
kind: ConfigContext
metadata:
  description: This is a config context
  name: sp-context
  project: defaultproject
spec:
  envs:
  - key: AWS_ACCESS_KEY_ID
    options:
      description: Enter the aws access key
      override:
        type: allowed
      sensitive: true
    value: accesskey
  - key: TF_CLI_CONFIG_FILE
    options:
      description: Enter the rctl config file name
      override:
        type: notallowed
    value: token
  - key: TF_VAR_eks_cluster_project
    options:
      description: Select the project
      override:
        restrictedValues:
        - sp
        - defaultproject
        - sp-git-sync
        type: restricted
      required: true
    value: defaultproject
  - key: DRIVER_DEBUG
    options:
      override:
        restrictedValues:
        - "true"
        - "false"
        type: restricted
      required: true
    value: "false"
  files:
  - data: aGk=
    name: config.json
    options:
      description: Enter the rctl config data
      override:
        type: notallowed
      sensitive: true
  - data: eyAibnNfbmFtZSIgOiAic3AiIH0K
    name: ns.tfvars.json
    options:
      override:
        type: allowed
      required: true
  variables:
  - name: aws_cloud_provider_name
    options:
      description: Enter the cloud credential name
      override:
        type: allowed
      required: true
    value: $(resource."sp-env".output.aws_cloud_provider_name.value)$
    valueType: expression
  - name: aws_cloud_provider_access_key
    options:
      override:
        type: notallowed
      sensitive: true
    value: accesskey
    valueType: text
  - name: aws_cloud_provider_secret_key
    options:
      override:
        type: allowed
      sensitive: true
    value: secretkey
    valueType: text
  - name: eks_blueprint
    options:
      override:
        restrictedValues:
        - '[default]'
        - '[minimal]'
        - '[sp]'
        type: restricted
    value: '[minimal]'
    valueType: hcl
  - name: eks_blueprint_version
    options:
      override:
        type: notallowed
    value: '{"latest"}'
    valueType: json
  sharing:
    enabled: true
    projects:
    - name: project1    
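
The data values under files in the spec above (aGk= and eyAibnNfbmFtZSIgOiAic3AiIH0K) decode cleanly as base64, so file content appears to be stored base64-encoded. The following is a minimal Python sketch, not part of any Rafay tooling, for decoding such a value to inspect it or encoding new content before committing a spec. The helper names decode_file_data and encode_file_data are hypothetical.

```python
import base64

# `data` values copied from the `files` entries in the ConfigContext spec above.
encoded_files = {
    "config.json": "aGk=",
    "ns.tfvars.json": "eyAibnNfbmFtZSIgOiAic3AiIH0K",
}


def decode_file_data(data: str) -> str:
    """Decode a base64 `data` value into UTF-8 text (hypothetical helper)."""
    return base64.b64decode(data).decode("utf-8")


def encode_file_data(text: str) -> str:
    """Encode file content as base64 for use in a `data` field (hypothetical helper)."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


decoded = {name: decode_file_data(data) for name, data in encoded_files.items()}
for name, text in decoded.items():
    # repr() makes trailing newlines and spacing visible.
    print(f"{name}: {text!r}")
```

Decoding before review helps confirm that a placeholder such as aGk= (the text "hi") has been replaced with real content before the spec is synced.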

Static Resource

apiVersion: eaas.envmgmt.io/v1
kind: Resource
metadata:
  description: This is a static resource
  name: demo-statresource
  project: defaultproject
spec:
  variables:
  - name: eks_cluster_project
    options:
      description: Enter the project name
    value: demo
    valueType: text
  - name: aws_cloud_provider_name
    value: '[demo]'
    valueType: hcl
  - name: aws_cloud_provider_access_key
    options:
      sensitive: true
    value: accesskey
    valueType: text
  - name: eks_blueprint
    value: '[minimal]'
    valueType: hcl
  - name: blueprint_version
    value: '{"latest"}'
    valueType: json
  - name: eks_cluster_name
    value: $(environment.name)$
    valueType: expression

Resource Template

The Resource Template config specs below are grouped by provider.

OpenTofu

Below is an example YAML configuration file for a Resource Template with the OpenTofu provider, created in the project demoproject.

apiVersion: eaas.envmgmt.io/v1
kind: ResourceTemplate
metadata:
  name: demo-rt-ot
  project: demoproject
  description: This is a resource template with OpenTofu provider
  annotations:
    eaas.envmgmt.io/github: https://github.com/user1-rafay/envmgr-demo
    envmgmt.io/project-limits: "3"
  labels:
    env: qc
    release: stable
spec:
  agents:
  - name: sp-agent
  - name: demo-agent
  artifactDriver:
    name: demo-art
  contexts:
  - name: demo1
  - name: demo2
  hooks:
    onInit: # Supported types are onInit, onSuccess, onFailure and onCompletion
    - agents:
      - name: sp-agent
      name: res-hook-on-init-approval
      onFailure: continue
      options:
        approval:
          type: internal
      timeoutSeconds: 3600
      type: approval
    - agents:
      - name: sp-agent
      dependsOn:
      - res-hook-on-init-approval
      name: res-on-init-container
      onFailure: continue
      options:
        container:
          arguments:
          - -refresh=true
          - --log-level=2
          commands:
          - /bin/sh
          - -c
          cpuLimitMilli: "512"
          envvars:
            AWS_ACCESS_KEY_ID: "accesskey"
            AWS_SECRET_ACCESS_KEY: "secretkey"
          image: docker.io/user1569/security:1.1
          memoryLimitMB: "1024"
          successCondition: |-
            if #status.http.statusCode == 200 {
              success: true
            }
            if #status.http.statusCode != 200 {
              failed: true
              reason: "url not reachable"
            }
          workingDirPath: /security/
      timeoutSeconds: 3598
      type: container
    provider:
      opentofu:
        deploy:
          init: # Supported types are init, plan, apply and output
            before: # Supported types are before and after
            - agents:
              - name: sp-agent
              name: provider-deploy-success-http-before
              onFailure: continue
              options:
                http:
                  body: <h1>This is a heading</h1>
                  endpoint: https://httpbin.org
                  headers:
                    Content-type: application/json
                    X-TOKEN: 1234
                  method: GET
                  successCondition: |-
                    if #status.http.statusCode == 200 {
                      success: true
                    }
                    if #status.http.statusCode != 200 {
                      failed: true
                      reason: "url not reachable"
                    }
              timeoutSeconds: 3600
              type: http
        destroy:
          init: # Supported types are init, plan and destroy
            after: # Supported types are before and after
            - agents:
              - name: sp-agent
              driver:
                name: custom-fm-opentofu-driver
              name: provider-destroy-after-init-driver
              onFailure: continue
              timeoutSeconds: 2600
              type: driver
  provider: opentofu
  providerOptions:
    driver:
      name: custom-fm-opentofu-driver
    opentofu:
      backendConfigs:
      - key=s3bucketname
      - region=us-west-2
      - encrypt=true
      backendType: custom # Supported types are system and custom (Don't specify the backendType for system)
      lock: true
      lockTimeoutSeconds: 2600
      pluginDirs:
      - plugin1
      - plugin2
      refresh: true
      timeoutSeconds: 3600
      varFiles:
      - ns.tfvars.json
      - bp.tfvars.json
      version: 1.7.2 # Don't specify the OpenTofu version if an output driver is used. Supported versions are 1.6.2, 1.7.2 and latest
      volumes:
      - mountPath: /tmp/.test
      - enableBackupAndRestore: true
        mountPath: /tmp/.test1
      - mountPath: /tmp/.test2
        pvcSizeGB: "2"
        pvcStorageClass: gp2
        usePVC: true
  repositoryOptions:
    branch: main
    directoryPath: cloud-creds
    name: demo-envmgr
  variables:
  - name: aws_cloud_provider_name
    options:
      description: Enter the cloud credential name
      override:
        type: allowed
      required: true
    value: '[demo]'
    valueType: hcl
  - name: aws_cloud_provider_access_key
    options:
      override:
        type: allowed
      sensitive: true
    value: accesskey
    valueType: text
  - name: aws_cloud_provider_secret_key
    options:
      override:
        type: notallowed
      sensitive: true
    value: secretkey
    valueType: text
  - name: eks_cluster_project
    options:
      override:
        restrictedValues:
        - demo
        - defaultproject
        - demo-git-sync
        type: restricted
    value: defaultproject
    valueType: text
  - name: eks_blueprint
    options:
      override:
        type: allowed
    value: '{"default"}'
    valueType: json
  - name: rafay_config_file
    options:
      override:
        type: notallowed
    value: config.json
    valueType: text
  - name: eks_cluster_name
    options:
      override:
        type: allowed
    value: $(environment.name)$
    valueType: expression
  version: v1
  sharing:
    enabled: true
    projects:
    - name: project1

Below is an example YAML configuration file for a Resource Template that uses an inline driver, created in the project demoproject.

apiVersion: eaas.envmgmt.io/v1
kind: ResourceTemplate
metadata:
  name: demo-rt-inline-driver
  project: demoproject
spec:
  provider: opentofu
  providerOptions:
    driver:
      data:
        config:
          container:
            cpuLimitMilli: "512"
            image: registry.dev.rafay-edge.net/rafay/opentofu-driver:main-48
            kubeConfigOptions:
              kubeConfig: kubeconfig
            kubeOptions:
              tolerations:
              - effect: NoSchedule
                key: node1
                operator: Equal
                value: value1
              - effect: NoSchedule
                key: driver1
                operator: Exists
            memoryLimitMb: "1024"
          type: container
      name: demo-inline
    openTofu:
      backendType: system
      lock: true
      refresh: true
      timeoutSeconds: 1800
  repositoryOptions:
    branch: master
    directoryPath: infrastructure-as-code/aws-ec2-instance
    name: demo-sync
  version: v1

HCP Terraform

Below is an example YAML configuration file for a Resource Template with the HCP Terraform provider, created in the project demoproject.

apiVersion: eaas.envmgmt.io/v1
kind: ResourceTemplate
metadata:
  name: demoproject
  project: demoproject
  description: This is a resource template with HCP Terraform provider
  annotations:
    eaas.envmgmt.io/github: https://github.com/user1-rafay/envmgr-demo
    envmgmt.io/project-limits: "3"
  labels:
    env: qc
    release: stable
spec:
  agents:
  - name: sp-agent1
  - name: demo-agent
  artifactDriver:
    name: demo-art
  contexts:
  - name: demo1
  - name: demo2
  hooks:
    onInit: # Supported types are onInit, onSuccess, onFailure and onCompletion
    - agents:
      - name: sp-agent1
      name: res-hook-on-init-approval
      onFailure: continue
      options:
        approval:
          type: internal
      timeoutSeconds: 3600
      type: approval
    - agents:
      - name: sp-agent1
      dependsOn:
      - res-hook-on-init-approval
      name: res-on-init-container
      onFailure: continue
      options:
        container:
          arguments:
          - -refresh=true
          - --log-level=2
          commands:
          - /bin/sh
          - -c
          cpuLimitMilli: "512"
          envvars:
            AWS_ACCESS_KEY_ID: "accesskey"
            AWS_SECRET_ACCESS_KEY: "secretkey"
          image: docker.io/user1569/security:1.1
          memoryLimitMB: "1024"
          successCondition: |-
            if #status.http.statusCode == 200 {
              success: true
            }
            if #status.http.statusCode != 200 {
              failed: true
              reason: "url not reachable"
            }
          workingDirPath: /security/
      timeoutSeconds: 3598
      type: container
    provider:
      hcpterraform:
        deploy:
          init: # Supported types are init, plan, apply and output
            before: # Supported types are before and after
            - agents:
              - name: sp-agent1
              name: provider-deploy-success-http-before
              onFailure: continue
              options:
                http:
                  body: <h1>This is a heading</h1>
                  endpoint: https://httpbin.org
                  headers:
                    Content-type: application/json
                    X-TOKEN: 1234
                  method: GET
                  successCondition: |-
                    if #status.http.statusCode == 200 {
                      success: true
                    }
                    if #status.http.statusCode != 200 {
                      failed: true
                      reason: "url not reachable"
                    }
              timeoutSeconds: 3600
              type: http
        destroy:
          init: # Supported types are init, plan and destroy
            after: # Supported types are before and after
            - agents:
              - name: sp-agent1
              driver:
                name: custom-fm-opentofu-driver
              name: provider-destroy-after-init-driver
              onFailure: continue
              timeoutSeconds: 2600
              type: driver
  provider: hcpterraform
  providerOptions:
    driver:
      name: custom-fm-opentofu-driver
    hcpterraform:
      lock: true
      lockTimeoutSeconds: 2600
      pluginDirs:
      - plugin1
      - plugin2
      refresh: true
      timeoutSeconds: 3600
      varFiles:
      - ns.tfvars.json
      - bp.tfvars.json
      volumes:
      - mountPath: /tmp/.test
      - enableBackupAndRestore: true
        mountPath: /tmp/.test1
      - mountPath: /tmp/.test2
        pvcSizeGB: "2"
        pvcStorageClass: gp2
        usePVC: true
  repositoryOptions:
    branch: main
    directoryPath: cloud-creds
    name: demo-envmgr
  variables:
  - name: aws_cloud_provider_name
    options:
      description: Enter the cloud credential name
      override:
        type: allowed
      required: true
    value: '[demo]'
    valueType: hcl
  - name: aws_cloud_provider_access_key
    options:
      override:
        type: allowed
      sensitive: true
    value: accesskey
    valueType: text
  - name: aws_cloud_provider_secret_key
    options:
      override:
        type: notallowed
      sensitive: true
    value: secretkey
    valueType: text
  - name: eks_cluster_project
    options:
      override:
        restrictedValues:
        - sp
        - defaultproject
        - sp-git-sync
        type: restricted
    value: defaultproject
    valueType: text
  - name: eks_blueprint
    options:
      override:
        type: allowed
    value: '{"default"}'
    valueType: json
  - name: rafay_config_file
    options:
      override:
        type: notallowed
    value: config.json
    valueType: text
  - name: eks_cluster_name
    options:
      override:
        type: allowed
    value: $(environment.name)$
    valueType: expression
  version: v1
  sharing:
    enabled: true
    projects:
    - name: project1

Custom Provider

Below is an example YAML configuration file for a Resource Template with a Custom provider, created in the project defaultproject.

apiVersion: eaas.envmgmt.io/v1
kind: ResourceTemplate
metadata:
  name: demo-provider
  project: defaultproject
spec:
  provider: custom
  providerOptions:
    custom:
      tasks:
        - agents:
            - name: agent1
          onFailure: continue
          timeoutSeconds: 300
          type: driver
          name: task1
          driver:
            name: driver1
        - agents:
            - name: agent2
          onFailure: continue
          timeoutSeconds: 600
          type: driver
          name: task2
          driver:
            name: driver2
          dependsOn:
            - task1
  version: v1
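
In the spec above, task2 lists task1 under dependsOn, which implies task1 runs first. How the controller actually schedules tasks is not described here; the sketch below is only a local check, assuming dependsOn expresses a simple "run after" relationship. It uses Python's standard graphlib module to derive one valid order and to flag dependsOn entries that name an undefined task. The function name execution_order is hypothetical.

```python
from graphlib import TopologicalSorter

# Task names and dependsOn lists copied from the Custom Provider spec above.
tasks = [
    {"name": "task1", "dependsOn": []},
    {"name": "task2", "dependsOn": ["task1"]},
]


def execution_order(task_list):
    """Return one valid run order, assuming dependsOn means 'run after'."""
    # Map each task to the set of tasks that must finish before it.
    graph = {t["name"]: set(t.get("dependsOn", [])) for t in task_list}
    known = set(graph)
    for name, deps in graph.items():
        missing = deps - known
        if missing:
            raise ValueError(
                f"{name} depends on undefined task(s): {sorted(missing)}"
            )
    # static_order() yields dependencies before dependents and raises
    # graphlib.CycleError if the dependencies form a cycle.
    return list(TopologicalSorter(graph).static_order())


order = execution_order(tasks)
print(order)
```

Running a check like this before committing a spec catches a dependsOn entry that points at a misspelled or missing task name.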

Environment Template

apiVersion: eaas.envmgmt.io/v1
kind: EnvironmentTemplate
metadata:
  name: demo-et3
  project: project1
  description: This is an environment template
  displayName: demo-environment-template
  annotations:
    eaas.envmgmt.io/category: AWS,Nvidia,AI/ML
    envmgmt.io/project-limits: "2"
  labels:
    env: qc
    release: stable
spec:
  agents:
  - name: demo-agent1
  - name: demo-scale
  contexts:
  - name: demo1
  - name: demo2
  hooks:
    onFailure:
    - agents:
      - name: demo-agent1
      driver:
        name: demo-art
      name: onfailure-driver
      onFailure: continue
      timeoutSeconds: 3600
      type: driver
    onInit:
    - agents:
      - name: demo-agent1
      name: oninit-approval
      onFailure: continue
      options:
        approval:
          type: internal
      timeoutSeconds: 3600
      type: approval
    - agents:
      - name: demo-agent1
      dependsOn:
      - oninit-approval
      name: oninit-container
      onFailure: continue
      options:
        container:
          arguments:
          - -refresh=false
          - --log-level=2
          commands:
          - /bin/sh
          - -c
          cpuLimitMilli: "512"
          envvars:
            DOWNLOAD_TOKEN: "token"
            DOWNLOAD_URL: "url"
          image: docker.io/user1569/security:1.1
          memoryLimitMB: "1024"
          successCondition: |-
            if #status.http.statusCode == 200 {
              success: true
            }
            if #status.http.statusCode != 200 {
              failed: true
              reason: "url not reachable"
            }
          workingDirPath: /security/
      timeoutSeconds: 3600
      type: container
    onSuccess:
    - agents:
      - name: demo-agent1
      name: onsuccess-http
      onFailure: continue
      options:
        http:
          body: <h1>This is a heading</h1>
          endpoint: https://httpbin.org
          headers:
            Content-type: application/json
            X-TOKEN: 1234
          method: GET
          successCondition: |-
            if #status.http.statusCode == 200 {
              success: true
            }
            if #status.http.statusCode != 200 {
              failed: true
              reason: "url not reachable"
            }
      timeoutSeconds: 3600
      type: http
  iconURL: iconurl
  readme: |-
    This is an
    environment template
    with all options
  resources:
  - kind: resourcetemplate
    name: demo-rt-hcp
    resourceOptions:
      dedicated: true
      version: v1
    type: dynamic
  - kind: resource
    name: demo-stat
    type: static
  - kind: environment
    name: demo-env
    type: static
  - kind: resourcetemplate
    name: demo-tf1
    resourceOptions:
      version: v1
    type: dynamic
  - dependsOn:
    - name: demo-tf1
    kind: resourcetemplate
    name: demo-rt-tf
    resourceOptions:
      version: v1
    type: dynamic
  variables:
  - name: aws_cloud_provider_name
    options:
      description: Enter the cloud credential name
      override:
        type: allowed
      required: true
    value: '[demo]'
    valueType: hcl
  - name: aws_cloud_provider_access_key
    options:
      override:
        type: allowed
      sensitive: true
    value: accesskey
    valueType: text
  - name: aws_cloud_provider_secret_key
    options:
      override:
        type: notallowed
      sensitive: true
    value: secretkey
    valueType: text
  - name: eks_cluster_project
    options:
      override:
        restrictedValues:
        - demo
        - defaultproject
        - project1
        type: restricted
    value: defaultproject
    valueType: text
  - name: eks_blueprint
    options:
      override:
        type: allowed
    value: '{"default"}'
    valueType: json
  - name: rafay_config_file
    options:
      override:
        type: notallowed
    value: config.json
    valueType: text
  - name: eks_cluster_name
    options:
      override:
        type: allowed
    value: $(environment.name)$
    valueType: expression
  version: v1
  sharing:
    enabled: true
    projects:
    - name: project2

Environment

apiVersion: eaas.envmgmt.io/v1
kind: Environment
metadata:
  name: sp-env2
  project: demoproject
  description: This is an environment
spec:
  agents:
  - name: sp-scale
  - name: sp-agent1
  envVars:
  - key: AWS_ACCESS_KEY_ID
    options:
      override:
        type: allowed
      required: true
      sensitive: true
    value: accesskey
  files:
  - data: aGk=
    mountPath: config.json
    options:
      override:
        type: allowed
      required: true
      sensitive: true
  sharing:
    enabled: true
    projects:
    - name: project1
  template:
    name: sp-et3
    version: v1
  variables:
  - name: aws_cloud_provider_name
    options:
      description: Enter the cloud credential name
      override:
        type: allowed
      required: true
    value: '[sp]'
    valueType: hcl # Supported values: text, hcl, json and expression
  - name: aws_cloud_provider_access_key
    options:
      override:
        type: allowed
      sensitive: true
    value: accesskey
    valueType: text