# GenAI Services Setup

This section outlines the steps required to install and configure the GenAI services stack on a GPU PaaS controller. The setup deploys the GAAP controller, supporting services, AI gateway components, and the dependencies required for running GenAI workloads.

## Install Helm

```bash
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
```
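Before continuing, it can help to confirm that the `helm` binary actually landed on the `PATH`. The following is a minimal sketch; the `require_cmd` helper is an illustrative name, not part of the product.

```bash
# Hypothetical helper: succeed only if the named command is on PATH.
require_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if require_cmd helm; then
  # Print the installed Helm client version.
  helm version --short
else
  echo "helm not found on PATH; rerun the installer" >&2
fi
```

The same helper can be reused for `kubectl`, which every later step depends on.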

## Configure PostgreSQL for Temporal

Edit the PostgreSQL configuration in the `rafay-core` namespace:

```bash
kubectl edit postgresqls -n rafay-core postgres-admin
```

Add the following databases and users:

```yaml
databases:
  temporal: temporaldbuser
  temporal_visibility1: temporalvdbuser
users:
  temporaldbuser: []
  temporalvdbuser: []
```
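For scripted setups, the same change could be applied without an interactive editor. The sketch below assumes that the `postgresqls` resource keeps `databases` and `users` under `.spec` (as in the Zalando Postgres Operator schema); confirm this against the live object before using it. The function names `patch_temporal_dbs` and `show_temporal_dbs` are hypothetical.

```bash
# Merge patch carrying the databases and users listed above.
# Assumption: both maps live under .spec in the postgresqls resource.
TEMPORAL_PATCH='{
  "spec": {
    "databases": {
      "temporal": "temporaldbuser",
      "temporal_visibility1": "temporalvdbuser"
    },
    "users": {
      "temporaldbuser": [],
      "temporalvdbuser": []
    }
  }
}'

# Hypothetical wrapper: call it explicitly once the patch has been reviewed.
patch_temporal_dbs() {
  kubectl patch postgresqls postgres-admin -n rafay-core \
    --type merge -p "$TEMPORAL_PATCH"
}

# Hypothetical wrapper: print the currently configured databases and users.
show_temporal_dbs() {
  kubectl get postgresqls postgres-admin -n rafay-core \
    -o jsonpath='{.spec.databases}{"\n"}{.spec.users}{"\n"}'
}
```

Running `show_temporal_dbs` before and after the change gives a quick comparison of what was added.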

## Set OpenEBS HostPath as the Default StorageClass

```bash
kubectl patch storageclass openebs-hostpath \
  -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
```
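To confirm the annotation took effect, list the StorageClasses and look for the `(default)` marker next to `openebs-hostpath`. The sketch below automates that check; `is_default_sc` is a hypothetical helper that parses the table printed by `kubectl get storageclass`.

```bash
# Hypothetical helper: read `kubectl get storageclass` output on stdin and
# succeed only if the named class is flagged "(default)".
is_default_sc() {
  awk -v sc="$1" '$1 == sc && $2 == "(default)" { found = 1 } END { exit !found }'
}

# Usage (requires cluster access):
#   kubectl get storageclass | is_default_sc openebs-hostpath && echo "default is set"
```

It is also worth checking that no other class carries the same marker, since only one default is expected.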

## Update Ops Console Routing

Remove the `/api` route from the Ops Console VirtualService:

```bash
kubectl edit vs -n istio-system opsconsole-ingress-vs
```
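Since this step edits a live routing object, saving a copy first makes it easy to restore. The sketch below is one way to do that; `backup_opsconsole_vs` and the default backup file name are hypothetical, and the `prefix:` search assumes the routes are matched by URI prefix.

```bash
# Hypothetical helper: save the current VirtualService to a file and list
# the lines that mention URI prefixes, so the /api route is easy to locate.
backup_opsconsole_vs() {
  local out="${1:-opsconsole-ingress-vs.backup.yaml}"
  kubectl get vs opsconsole-ingress-vs -n istio-system -o yaml > "$out"
  grep -n 'prefix:' "$out" || echo "no prefix matches found in $out"
}

# Usage (requires cluster access):
#   backup_opsconsole_vs
```

If the edit needs to be undone, the saved file can be reviewed and reapplied with `kubectl apply -f`.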

## Add the GAAP Controller Helm Repository

```bash
helm repo add gaap-controller https://rafaysystems.github.io/gaap-controller/
helm repo update
```
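Once the repository is added, `helm search repo` can confirm that the chart version used in the next step is actually published. A small sketch, with `has_chart_version` as a hypothetical helper that parses the search output:

```bash
# Hypothetical helper: read `helm search repo ... --versions` output on stdin
# and succeed only if the given chart version is listed.
has_chart_version() {
  awk -v v="$1" 'NR > 1 && $2 == v { found = 1 } END { exit !found }'
}

# Usage (requires the repository added above):
#   helm search repo gaap-controller/gaap-controller --versions | has_chart_version 0.3.25
```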

## Install the GAAP Controller

```bash
helm install gaap-controller gaap-controller/gaap-controller \
  --version 0.3.25 -n rafay-core -f values.yaml
```

## Upgrade Controller Images (Optional)

```bash
helm upgrade gaap-controller gaap-controller/gaap-controller \
  --set fullnameOverride=gaap-controller \
  -n rafay-core --version 0.3.25 -f values.yaml
```
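After an install or upgrade, the release state and pod status can be inspected in one pass. This is a sketch only; `check_gaap_release` is a hypothetical wrapper around standard `helm` and `kubectl` commands.

```bash
# Hypothetical helper: summarize the release after an install or upgrade.
check_gaap_release() {
  # Current release status and notes
  helm status gaap-controller -n rafay-core
  # Most recent revisions, useful when deciding whether to roll back
  helm history gaap-controller -n rafay-core --max 5
  # Pods in the namespace the chart was installed into
  kubectl get pods -n rafay-core
}

# Usage (requires cluster access):
#   check_gaap_release
```

If an upgrade misbehaves, `helm rollback gaap-controller <REVISION> -n rafay-core` returns to a revision shown by `helm history`.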

## values.yaml for GPU PaaS Deployment

This values file includes reduced resource requests and the required GenAI service components.

```yaml
replicaCount: 1
image:
  repository: registry.dev.rafay-edge.net/rafay/gaap-controller
  pullPolicy: IfNotPresent
  tag: "latest"
tokenExchangeConfig:
  image:
    repository: registry.dev.rafay-edge.net/rafay/titan-core-controller
    tag: "main-38"
  replicaCount: 1
  resources:
    limits:
      cpu: 1000m
      memory: 1Gi
    requests:
      cpu: 100m
      memory: 128Mi
idpConfig:
  image:
    repository: registry.dev.rafay-edge.net/rafay/titan-core-controller
    tag: "main-38"
  replicaCount: 1
  resources:
    limits:
      cpu: 1000m
      memory: 1Gi
    requests:
      cpu: 100m
      memory: 128Mi
authProxyConfig:
  image:
    repository: registry.dev.rafay-edge.net/rafay/ai-auth-proxy
    tag: "main-273"
  resources:
    limits:
      cpu: 1000m
      memory: 1Gi
    requests:
      cpu: 100m
      memory: 128Mi
aisrvConfig:
  image:
    repository: registry.dev.rafay-edge.net/rafay/titan-core-controller
    tag: "main-38"
  replicaCount: 1
  resources:
    limits:
      cpu: 1000m
      memory: 1Gi
    requests:
      cpu: 100m
      memory: 128Mi
  images:
    analyzer: registry.dev.rafay-edge.net/finetune:v9-compressor
    evaluation: registry.dev.rafay-edge.net/finetune:v9-compressor
    inference: registry.dev.rafay-edge.net/rafay/triton:v19-openai
    quantization: registry.dev.rafay-edge.net/finetune:v9-compressor
    training: registry.dev.rafay-edge.net/finetune:v9-compressor
    operator: registry.dev.rafay-edge.net/rafay/titan-core-cluster:main-38
    aig_extproc: registry.dev.rafay-edge.net/rafay/titan-core-extproc:main-38
    envoy_gateway: docker.io/envoyproxy/gateway:v1.6.0
    envoy_ratelimit: docker.io/envoyproxy/ratelimit:99d85510
    ai_gateway_controller: docker.io/envoyproxy/ai-gateway-controller:v0.4.0
    rafay_csi_driver: registry.dev.rafay-edge.net/dev/ai-repo-csi-driver:20251118012037
compute:
  images:
    repository: registry.dev.rafay-edge.net/rafay/gaap-compute-operator
    tag: "main-47"
syncer:
  images:
    repository: registry.dev.rafay-edge.net/rafay/titan-core-cluster
    tag: "main-38"
rack:
  version: "1.0.12"
  baseurl: "https://github.com/RafaySystems/gaap-rack/releases/download"

modeldb:
  image:
    repository: registry.dev.rafay-edge.net/rafay/gaap-inference-modeldb
    tag: "main-182"
capi:
  image:
    repository: registry.dev.rafay-edge.net/rafay/cluster-api-provider-gaapx
    tag: "main-2"
scheduler:
  replicaCount: 1
  resources:
    limits:
      cpu: 1000m
      memory: 1Gi
    requests:
      cpu: 100m
      memory: 128Mi
```
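Before installing, the values file can be sanity-checked locally. The sketch below shows two optional checks: rendering the chart with `helm template` to catch YAML or schema errors, and flagging image tags set to `latest`, which combined with `pullPolicy: IfNotPresent` may leave nodes running a previously cached image. Both function names are hypothetical.

```bash
# Hypothetical helper: render the chart locally with the given values file.
# Nothing is sent to the cluster; a non-zero exit indicates a rendering error.
render_gaap_chart() {
  helm template gaap-controller gaap-controller/gaap-controller \
    --version 0.3.25 -n rafay-core -f "${1:-values.yaml}" > /dev/null \
    && echo "values file renders cleanly"
}

# Hypothetical helper: print (with line numbers) every tag set to "latest".
find_unpinned_tags() {
  grep -nE 'tag:[[:space:]]*"?latest"?[[:space:]]*$' "${1:-values.yaml}"
}

# Usage:
#   render_gaap_chart values.yaml
#   find_unpinned_tags values.yaml
```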

## GenAI Console Endpoints

The following endpoints are available after deployment:

  • https://console-genai.paas.rafay.dev/
  • https://ops-console-genai.paas.rafay.dev/
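A quick reachability check against the endpoints listed above can be sketched with `curl`. The helper names `http_status` and `check_genai_endpoints` are hypothetical, and the expected status code depends on how authentication is configured, so the sketch only prints what each URL returns.

```bash
# Hypothetical helper: print the HTTP status code returned by a URL.
http_status() {
  curl -sS -o /dev/null --max-time 10 -w '%{http_code}' "$1"
}

# Hypothetical helper: report the status code for each console endpoint.
check_genai_endpoints() {
  local url
  for url in \
    https://console-genai.paas.rafay.dev/ \
    https://ops-console-genai.paas.rafay.dev/; do
    echo "$url -> $(http_status "$url" || true)"
  done
}

# Usage (requires network access to the consoles):
#   check_genai_endpoints
```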