# GenAI Services Setup
This section outlines the steps required to install and configure the GenAI services stack on a GPU PaaS controller. The setup deploys the GAAP controller, supporting services, AI gateway components, and the dependencies required for running GenAI workloads.
## Install Helm
```bash
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
```
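
Before continuing, it can help to confirm that the installer actually placed `helm` on the `PATH`. The following is a minimal sketch; `require_cmd` is a hypothetical helper name introduced here for illustration and is not part of Helm.

```shell
# Hypothetical helper (not part of Helm): report whether a CLI is on the PATH.
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1" >&2
    return 1
  fi
}
```

For example, `require_cmd helm && helm version --short` prints the installed client version only when the binary is present.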
## Configure PostgreSQL for Temporal
Edit the PostgreSQL configuration in the `rafay-core` namespace:
```bash
kubectl edit postgresqls -n rafay-core postgres-admin
```
Add the following databases and users:
```yaml
databases:
  temporal: temporaldbuser
  temporal_visibility1: temporalvdbuser
users:
  temporaldbuser: []
  temporalvdbuser: []
```
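
After saving the edit, the entries can be read back to confirm they were stored. This sketch assumes the `postgresqls` resource keeps the map under `.spec.databases` (the layout used by the Zalando postgres-operator); that path is an assumption, and `db_owner` and `check_temporal_dbs` are hypothetical helper names.

```shell
# Assumption: databases are declared under .spec.databases on the resource.
# Print the owner configured for one database name (empty if absent).
db_owner() {
  kubectl get postgresqls -n rafay-core postgres-admin \
    -o jsonpath="{.spec.databases.$1}"
}

# Check that both Temporal databases from the step above are present.
check_temporal_dbs() {
  for db in temporal temporal_visibility1; do
    owner=$(db_owner "$db")
    if [ -n "$owner" ]; then
      echo "$db -> $owner"
    else
      echo "missing database entry: $db" >&2
      return 1
    fi
  done
}
```

Running `check_temporal_dbs` prints one line per database and returns a non-zero status if either entry is missing.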
## Set OpenEBS HostPath as the Default StorageClass
```bash
kubectl patch storageclass openebs-hostpath \
  -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
```
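
To confirm the patch took effect, the default class can be listed. This is a small sketch that relies on `kubectl get storageclass` marking default classes with `(default)` in the NAME column; `default_storageclasses` is a hypothetical helper name.

```shell
# List the StorageClasses that kubectl marks with "(default)" in the NAME column.
default_storageclasses() {
  kubectl get storageclass --no-headers | awk '/\(default\)/ { print $1 }'
}
```

The expected result here is a single line, `openebs-hostpath`. If more than one name is printed, consider removing the default annotation from the other classes.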
## Update Ops Console Routing
Remove the `/api` route from the Ops Console VirtualService:
```bash
kubectl edit vs -n istio-system opsconsole-ingress-vs
```
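
Because this step edits a live routing object, it may be worth saving a copy first and locating the `/api` entry before opening the editor. A minimal sketch follows; the helper name and the default backup file name are arbitrary choices made for this example.

```shell
# Save the current VirtualService to a file, then show where "/api" appears in it.
# The default backup file name is an arbitrary choice for this example.
backup_and_find_api_route() {
  backup=${1:-opsconsole-ingress-vs.backup.yaml}
  kubectl get vs -n istio-system opsconsole-ingress-vs -o yaml > "$backup" || return 1
  grep -n '/api' "$backup"
}
```

The function prints the matching lines with their line numbers, and returns a non-zero status if the VirtualService contains no `/api` entry.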
## Add the GAAP Controller Helm Repository
```bash
helm repo add gaap-controller https://rafaysystems.github.io/gaap-controller/
helm repo update
```
## Install the GAAP Controller
```bash
helm install gaap-controller gaap-controller/gaap-controller --version 0.3.25 -n rafay-core -f values.yaml
```
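
Once the install command returns, the release state can be checked. The sketch below parses the `STATUS:` line from the plain-text output of `helm status`; `release_status` is a hypothetical helper name.

```shell
# Print the STATUS field from `helm status` for a release.
# Prints nothing if the release cannot be found.
release_status() {
  helm status "$1" -n "$2" 2>/dev/null | awk -F': ' '/^STATUS:/ { print $2 }'
}
```

For this deployment, `release_status gaap-controller rafay-core` should print `deployed`, and `kubectl get pods -n rafay-core` then shows whether the pods have started.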
## Upgrade Controller Images (Optional)
```bash
helm upgrade gaap-controller gaap-controller/gaap-controller \
  --set fullnameOverride=gaap-controller \
  -n rafay-core --version 0.3.25 -f values.yaml
```
## values.yaml for GPU PaaS Deployment
This values file includes reduced resource requests and the required GenAI service components.
```yaml
replicaCount: 1
image:
  repository: registry.dev.rafay-edge.net/rafay/gaap-controller
  pullPolicy: IfNotPresent
  tag: "latest"
tokenExchangeConfig:
  image:
    repository: registry.dev.rafay-edge.net/rafay/titan-core-controller
    tag: "main-38"
  replicaCount: 1
  resources:
    limits:
      cpu: 1000m
      memory: 1Gi
    requests:
      cpu: 100m
      memory: 128Mi
idpConfig:
  image:
    repository: registry.dev.rafay-edge.net/rafay/titan-core-controller
    tag: "main-38"
  replicaCount: 1
  resources:
    limits:
      cpu: 1000m
      memory: 1Gi
    requests:
      cpu: 100m
      memory: 128Mi
authProxyConfig:
  image:
    repository: registry.dev.rafay-edge.net/rafay/ai-auth-proxy
    tag: "main-273"
  resources:
    limits:
      cpu: 1000m
      memory: 1Gi
    requests:
      cpu: 100m
      memory: 128Mi
aisrvConfig:
  image:
    repository: registry.dev.rafay-edge.net/rafay/titan-core-controller
    tag: "main-38"
  replicaCount: 1
  resources:
    limits:
      cpu: 1000m
      memory: 1Gi
    requests:
      cpu: 100m
      memory: 128Mi
images:
  analyzer: registry.dev.rafay-edge.net/finetune:v9-compressor
  evaluation: registry.dev.rafay-edge.net/finetune:v9-compressor
  inference: registry.dev.rafay-edge.net/rafay/triton:v19-openai
  quantization: registry.dev.rafay-edge.net/finetune:v9-compressor
  training: registry.dev.rafay-edge.net/finetune:v9-compressor
  operator: registry.dev.rafay-edge.net/rafay/titan-core-cluster:main-38
  aig_extproc: registry.dev.rafay-edge.net/rafay/titan-core-extproc:main-38
  envoy_gateway: docker.io/envoyproxy/gateway:v1.6.0
  envoy_ratelimit: docker.io/envoyproxy/ratelimit:99d85510
  ai_gateway_controller: docker.io/envoyproxy/ai-gateway-controller:v0.4.0
  rafay_csi_driver: registry.dev.rafay-edge.net/dev/ai-repo-csi-driver:20251118012037
compute:
  images:
    repository: registry.dev.rafay-edge.net/rafay/gaap-compute-operator
    tag: "main-47"
syncer:
  images:
    repository: registry.dev.rafay-edge.net/rafay/titan-core-cluster
    tag: "main-38"
rack:
  version: "1.0.12"
  baseurl: "https://github.com/RafaySystems/gaap-rack/releases/download"
modeldb:
  image:
    repository: registry.dev.rafay-edge.net/rafay/gaap-inference-modeldb
    tag: "main-182"
capi:
  image:
    repository: registry.dev.rafay-edge.net/rafay/cluster-api-provider-gaapx
    tag: "main-2"
scheduler:
  replicaCount: 1
  resources:
    limits:
      cpu: 1000m
      memory: 1Gi
    requests:
      cpu: 100m
      memory: 128Mi
```
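
Because a values file of this size is easy to mis-indent, it can be useful to render the chart locally before installing or upgrading. The sketch below wraps `helm template` with the same chart, version, and namespace used in this section; `render_gaap_chart` is a hypothetical helper name, and the command assumes the `gaap-controller` repository has already been added.

```shell
# Render the chart locally with the given values file.
# Nothing is applied to the cluster; a YAML or template error makes helm exit non-zero.
render_gaap_chart() {
  helm template gaap-controller gaap-controller/gaap-controller \
    --version 0.3.25 -n rafay-core -f "${1:-values.yaml}"
}
```

For example, `render_gaap_chart values.yaml > /dev/null && echo "values render cleanly"` reports success only when Helm can parse the file and render every template.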
## GenAI Console Endpoints
The following endpoints are available after deployment:
- https://console-genai.paas.rafay.dev/
- https://ops-console-genai.paas.rafay.dev/
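
A quick reachability check for a console endpoint can be sketched with `curl`. The helper name `check_endpoint` and the 10-second timeout are assumptions made for this example. Which HTTP status a healthy console returns is not stated in this section, so the function only reports the code.

```shell
# Print the HTTP status code returned by an endpoint (000 if the request fails).
check_endpoint() {
  code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1")
  echo "$1 -> HTTP $code"
}

# Usage:
#   check_endpoint https://console-genai.paas.rafay.dev/
#   check_endpoint https://ops-console-genai.paas.rafay.dev/
```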