Rafay Agent Overview

A Rafay Agent is a service that runs in your local network or VPC. It is required to execute the tasks that provision compute or service instances for GPU PaaS, based on defined SKUs.

The agent can be deployed on either Docker or Kubernetes.

| Type       | Versions                    |
|------------|-----------------------------|
| Docker     | v2.x or higher              |
| Kubernetes | Currently supported version |

👉 To create and configure an agent, follow the instructions here: GitOps Agent Setup


Worker Model

The GitOps agent provisions workers (pods) to execute GPU PaaS instance-related tasks.

  • By default, each agent can spawn up to 10 workers on demand
  • Each worker executes one activity at a time
  • A single instance deployment may trigger multiple parallel activities
  • Workers are ephemeral, created only when needed, and terminated once tasks complete
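
The worker model above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the agent's actual implementation: a thread pool stands in for on-demand worker pods, and `MAX_WORKERS`, `run_activity`, `deploy_instance`, and the activity names are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
import threading
import time

MAX_WORKERS = 10  # default per-agent worker limit described above

_lock = threading.Lock()
_active = 0
peak_active = 0  # highest number of activities observed running at once


def run_activity(name: str) -> str:
    """Hypothetical stand-in for one worker executing exactly one activity."""
    global _active, peak_active
    with _lock:
        _active += 1
        peak_active = max(peak_active, _active)
    time.sleep(0.01)  # pretend to do some work
    with _lock:
        _active -= 1
    return f"{name}: done"


def deploy_instance(activities: list[str]) -> list[str]:
    """Run one instance's activities in parallel, capped at MAX_WORKERS."""
    # The pool (and its threads) exists only for the duration of the call,
    # loosely mirroring workers that are created on demand and then removed.
    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
        return list(pool.map(run_activity, activities))


results = deploy_instance([f"activity-{i}" for i in range(25)])
```

Even though 25 activities are submitted, `peak_active` never exceeds `MAX_WORKERS`; the remaining activities wait until a worker slot frees up.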

Instance Execution Lifecycle

The lifecycle of provisioning and deprovisioning GPU PaaS instances involves three high-level stages:

  1. Activity Generation
       • The Controller determines the number and types of required activities

  2. Agent Interaction
       • Agents continuously poll the Controller for activities
       • The Controller assigns activities based on agent association and worker limits

  3. Agent Execution
       • A new pod is spawned per activity
       • Activities may run sequentially or in parallel (depending on underlying code)
       • Once completed, pods terminate and resources are released
       • Agents continue polling and queueing additional tasks
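
The three stages can be sketched as a simple poll loop. This is a simplified, single-process illustration and not the real Controller or agent API: the `Controller` and `Agent` classes, their methods, and the agent and activity names are hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Controller:
    """Hypothetical controller that queues pending activities per agent."""
    pending: dict[str, deque] = field(default_factory=dict)

    def generate(self, agent_id: str, activities: list[str]) -> None:
        # Stage 1: decide which activities are required and queue them.
        self.pending.setdefault(agent_id, deque()).extend(activities)

    def poll(self, agent_id: str, free_slots: int) -> list[str]:
        # Stage 2: hand out work only to the associated agent, and never
        # more than that agent has free worker slots for.
        queue = self.pending.get(agent_id, deque())
        return [queue.popleft() for _ in range(min(free_slots, len(queue)))]


@dataclass
class Agent:
    agent_id: str
    max_workers: int = 10
    completed: list[str] = field(default_factory=list)

    def poll_once(self, controller: Controller) -> int:
        # Stage 3: one worker per activity. Here the "worker" is just a loop
        # iteration that finishes immediately and frees its slot.
        batch = controller.poll(self.agent_id, self.max_workers)
        for activity in batch:
            self.completed.append(activity)
        return len(batch)


controller = Controller()
controller.generate("agent-a", [f"act-{i}" for i in range(13)])
agent = Agent("agent-a")

batch_sizes = []
while (n := agent.poll_once(controller)) > 0:
    batch_sizes.append(n)
```

With 13 pending activities and a limit of 10 workers, the loop drains the queue in two polls of 10 and 3 activities. An agent that is not associated with the work (for example a hypothetical `agent-b`) receives nothing when it polls.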

Note: Every instance created at the PaaS layer corresponds to an underlying environment object.


Worker Sizing

Default Resource Requests and Limits per Worker:

| Worker Type     | CPU  | Memory |
|-----------------|------|--------|
| Git Worker      | 250m | 512Mi  |
| OpenTofu Worker | 500m | 1Gi    |
| Function Worker | 100m | 256Mi  |

For each worker type, the resource request and the resource limit are set to the same value.
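
To make these defaults concrete, the sketch below totals CPU and memory for a set of workers running at the same time. The per-worker figures come from the table above; the function name `total_resources` and the example mix of workers are assumptions made for illustration.

```python
# Default per-worker resources from the table above
# (CPU in millicores, memory in Mi).
WORKER_DEFAULTS = {
    "git": {"cpu_m": 250, "memory_mi": 512},
    "opentofu": {"cpu_m": 500, "memory_mi": 1024},  # 1Gi = 1024Mi
    "function": {"cpu_m": 100, "memory_mi": 256},
}


def total_resources(worker_counts: dict[str, int]) -> dict[str, int]:
    """Sum CPU and memory for a given number of concurrent workers per type."""
    total = {"cpu_m": 0, "memory_mi": 0}
    for worker_type, count in worker_counts.items():
        spec = WORKER_DEFAULTS[worker_type]
        total["cpu_m"] += spec["cpu_m"] * count
        total["memory_mi"] += spec["memory_mi"] * count
    return total


# Hypothetical mix: 2 Git, 5 OpenTofu, and 3 Function workers at once.
mix_total = total_resources({"git": 2, "opentofu": 5, "function": 3})
```

For this mix, the total comes to 3300m of CPU (3.3 cores) and 6912Mi of memory (6.75Gi).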


Capacity Planning & Scaling

  • Ensure the Kubernetes cluster or Docker host has adequate resources for agents and workers
  • Add a 10–20% buffer to handle peak workloads
  • Set the maximum worker count high enough to cover the anticipated number of parallel instance activities
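
As a worked example of this guidance, the sketch below estimates worst-case capacity for one agent's workers and adds a peak-load buffer. It assumes every worker is the largest default type (OpenTofu, 500m CPU and 1Gi memory) and does not include the agent's own footprint; `required_capacity` and its parameters are hypothetical names.

```python
import math


def required_capacity(
    max_workers: int,
    cpu_m_per_worker: int,
    memory_mi_per_worker: int,
    buffer_percent: int = 20,
) -> tuple[int, int]:
    """Worst-case (CPU millicores, memory Mi) for workers, plus a buffer."""
    factor = 100 + buffer_percent
    cpu_m = math.ceil(max_workers * cpu_m_per_worker * factor / 100)
    memory_mi = math.ceil(max_workers * memory_mi_per_worker * factor / 100)
    return cpu_m, memory_mi


# 10 workers (the default limit), all assumed to be OpenTofu workers.
cpu_m, memory_mi = required_capacity(10, 500, 1024, buffer_percent=20)
low_cpu_m, low_memory_mi = required_capacity(10, 500, 1024, buffer_percent=10)
```

Under these assumptions, a 20% buffer gives 6000m CPU and 12288Mi (12Gi) of memory, while a 10% buffer gives 5500m CPU and 11264Mi (11Gi). Raising the maximum worker count scales both figures linearly.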

Reliability & Failover

  • Deploy at least two agents per template or as part of Global Settings
  • If one agent fails or reaches full capacity, the other automatically continues processing
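
The failover behaviour can be pictured with the following sketch. It is a deliberate simplification: in practice agents pull work by polling the Controller, and how work is actually distributed between agents is not described here. `AgentState`, `pick_agent`, and the agent names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AgentState:
    name: str
    healthy: bool
    busy_workers: int
    max_workers: int = 10

    def has_capacity(self) -> bool:
        # An agent can take work only if it is up and has a free worker slot.
        return self.healthy and self.busy_workers < self.max_workers


def pick_agent(agents: list[AgentState]) -> Optional[AgentState]:
    """Return the first healthy agent with a free worker slot, if any."""
    for agent in agents:
        if agent.has_capacity():
            return agent
    return None


# agent-1 is at full capacity, so work falls through to agent-2.
agents = [
    AgentState("agent-1", healthy=True, busy_workers=10),
    AgentState("agent-2", healthy=True, busy_workers=3),
]
chosen = pick_agent(agents)

# With a single failed agent and no second agent, nothing can pick up work.
single = pick_agent([AgentState("agent-1", healthy=False, busy_workers=0)])
```

The second case shows why at least two agents are recommended: with only one agent, a failure or a full worker pool leaves no agent available to take new activities.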


Agent Assignment & Selection

Agents can be configured either within an environment template or centrally in Global Settings.

👉 For detailed instructions, see Global Overrides.