Rafay Agent Overview
A Rafay Agent is a service that runs in your local network or VPC. It is required to execute the tasks that provision compute or service instances, based on defined SKUs, for GPU PaaS.
The agent can be deployed on either Docker or Kubernetes.
Type | Versions |
---|---|
Docker | v2.x or higher |
Kubernetes | Currently supported version |
👉 To create and configure an agent, follow the instructions here: GitOps Agent Setup
Worker Model
The GitOps agent provisions workers (pods) to execute GPU PaaS instance-related tasks.
- By default, each agent can spawn up to 10 workers on demand
- Each worker executes one activity at a time
- A single instance deployment may trigger multiple parallel activities
- Workers are ephemeral, created only when needed, and terminated once tasks complete
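The worker cap can be pictured with a small simulation. This is only an analogy under stated assumptions: a thread pool stands in for the agent's on-demand workers, `peak_concurrency` is a hypothetical helper, and the real agent spawns pods rather than threads.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 10  # default per-agent worker limit described above


def peak_concurrency(activity_count, max_workers=MAX_WORKERS, duration=0.01):
    """Run `activity_count` dummy activities and return the highest number
    that were ever running at the same time."""
    lock = threading.Lock()
    running = 0
    peak = 0

    def activity(_):
        nonlocal running, peak
        with lock:
            running += 1
            peak = max(peak, running)
        time.sleep(duration)  # stand-in for real work; one activity per worker
        with lock:
            running -= 1

    # The pool never runs more than `max_workers` activities at once;
    # the remaining activities wait until a worker is free.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        list(pool.map(activity, range(activity_count)))
    return peak


if __name__ == "__main__":
    print(peak_concurrency(25))  # never exceeds MAX_WORKERS
```

With 25 activities and the default cap, the extra activities simply wait for a free worker, which is why a single instance deployment with many parallel activities can take several rounds to finish.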
Instance Execution Lifecycle
The lifecycle of provisioning and deprovisioning GPU PaaS instances involves three high-level stages:
- Activity Generation
  - The Controller determines the number and types of required activities
- Agent Interaction
  - Agents continuously poll the Controller for activities
  - The Controller assigns activities based on agent association and worker limits
- Agent Execution
  - A new pod is spawned per activity
  - Activities may run sequentially or in parallel (depending on underlying code)
  - Once completed, pods terminate and resources are released
  - Agents continue polling and queueing additional tasks
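The stages above can be sketched as a simple poll-and-execute loop. This is a simplified in-memory model, not the product's API: the `Controller` and `Agent` classes, their methods, and the activity names are all hypothetical, and each batch is assumed to finish before the next poll.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Controller:
    """Hypothetical stand-in for the Controller: holds pending activities."""
    pending: deque = field(default_factory=deque)

    def poll(self, free_workers):
        """Hand out at most `free_workers` activities, oldest first."""
        batch = []
        while self.pending and len(batch) < free_workers:
            batch.append(self.pending.popleft())
        return batch


@dataclass
class Agent:
    """Hypothetical agent with a fixed worker limit."""
    max_workers: int = 10
    completed: list = field(default_factory=list)

    def poll_once(self, controller):
        # Assumption: all workers are free again by the time of each poll.
        batch = controller.poll(self.max_workers)
        for activity in batch:
            # One ephemeral worker per activity; it is released when done.
            self.completed.append(activity)
        return len(batch)


def drain(controller, agent):
    """Poll until the Controller has nothing left; return productive polls."""
    polls = 0
    while agent.poll_once(controller):
        polls += 1
    return polls


if __name__ == "__main__":
    controller = Controller(deque(f"activity-{i}" for i in range(25)))
    agent = Agent()
    print(drain(controller, agent), len(agent.completed))  # 3 25
```

In this model, 25 activities against a limit of 10 workers are drained in three polls (10, 10, then 5).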
Note: Every instance created at the PaaS layer corresponds to an underlying environment object.
Worker Sizing
Default Resource Requests and Limits per Worker:
Worker Type | CPU | Memory |
---|---|---|
Git Worker | 250m | 512Mi |
OpenTofu Worker | 500m | 1Gi |
Function Worker | 100m | 256Mi |
Within each worker type, the resource request and the limit are set to the same value (shown above); the values themselves differ between worker types.
Capacity Planning & Scaling
- Ensure the Kubernetes cluster or Docker host has adequate resources for agents and workers
- Add a 10–20% buffer to handle peak workloads
- Set the maximum worker count high enough to cover the anticipated number of parallel instance activities
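As a rough worked example of the sizing arithmetic, the sketch below multiplies the per-worker defaults from the Worker Sizing table by an assumed mix of concurrent workers and then adds the buffer. The worker mix and the `required_capacity` helper are illustrative assumptions, and the agent's own footprint is not included because it is not specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkerSize:
    cpu_millicores: int
    memory_mib: int


# Defaults taken from the Worker Sizing table.
WORKER_SIZES = {
    "git": WorkerSize(cpu_millicores=250, memory_mib=512),
    "opentofu": WorkerSize(cpu_millicores=500, memory_mib=1024),
    "function": WorkerSize(cpu_millicores=100, memory_mib=256),
}


def _with_buffer(value, buffer_percent):
    """Add `buffer_percent` to `value`, rounding up (integer arithmetic)."""
    return (value * (100 + buffer_percent) + 99) // 100


def required_capacity(worker_counts, buffer_percent=20):
    """Return (CPU in millicores, memory in Mi) for a mix of concurrent
    workers, including the buffer. Excludes the agent's own resources."""
    cpu = sum(WORKER_SIZES[kind].cpu_millicores * count
              for kind, count in worker_counts.items())
    memory = sum(WORKER_SIZES[kind].memory_mib * count
                 for kind, count in worker_counts.items())
    return _with_buffer(cpu, buffer_percent), _with_buffer(memory, buffer_percent)


if __name__ == "__main__":
    # Assumed mix that fills the default limit of 10 workers.
    print(required_capacity({"git": 2, "opentofu": 6, "function": 2}))
    # Worst case for the default limit: 10 OpenTofu workers.
    print(required_capacity({"opentofu": 10}))
```

For the assumed mix, the base demand is 3700m CPU and 7680Mi memory, which becomes 4440m and 9216Mi with a 20% buffer. Ten OpenTofu workers, the largest default size, come to 6000m and 12288Mi with the same buffer.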
Reliability & Failover
- Deploy at least two agents per template or as part of Global Settings
- If one agent fails or reaches full capacity, the other automatically continues processing
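The failover behavior can be illustrated with a small selection rule. This is a hypothetical sketch only: the `AgentState` fields and the first-available policy in `pick_agent` are assumptions made for illustration, not a description of how the Controller actually chooses an agent.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class AgentState:
    """Hypothetical view of one agent's health and load."""
    name: str
    healthy: bool = True
    busy_workers: int = 0
    max_workers: int = 10

    @property
    def has_capacity(self) -> bool:
        return self.healthy and self.busy_workers < self.max_workers


def pick_agent(agents: Sequence[AgentState]) -> Optional[AgentState]:
    """Return the first healthy agent with a free worker, or None."""
    for agent in agents:
        if agent.has_capacity:
            return agent
    return None


if __name__ == "__main__":
    primary = AgentState("agent-a", busy_workers=10)  # at full capacity
    secondary = AgentState("agent-b", busy_workers=3)
    chosen = pick_agent([primary, secondary])
    print(chosen.name if chosen else "no agent available")  # agent-b
```

With only one agent, the same situation would leave activities waiting until a worker frees up or the agent recovers, which is the reason for deploying at least two.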
Agent Assignment & Selection
Agents can be configured either within an environment template or centrally in Global Settings.
👉 For detailed instructions, see Global Overrides.