01 Aug

Info

GPU PaaS releases are initially rolled out via Rafay's Air Gapped Controller form factor. These will be periodically bundled and rolled out into Rafay's Production SaaS.

v3.1-33¶

01 Aug, 2025

Dynamic Loading for Compute Catalog¶

The compute catalog records are now loaded dynamically, with support for pagination. This gives users of the "end user self service portal" a fast, responsive experience even when the catalog is large.
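
As a rough illustration of the pattern (not Rafay's actual API), paginated loading can be sketched as a client that requests one page at a time instead of the whole catalog. The `fetch_page` function, its `limit`/`offset` parameters, and the record fields below are hypothetical stand-ins.

```python
from typing import Callable, Dict, Iterator, List

# Hypothetical in-memory catalog standing in for the server side.
CATALOG: List[Dict[str, str]] = [{"name": f"profile-{i}"} for i in range(25)]


def fetch_page(limit: int, offset: int) -> Dict[str, object]:
    """Hypothetical page endpoint: returns one slice plus the total count."""
    return {"items": CATALOG[offset:offset + limit], "total": len(CATALOG)}


def iter_catalog(
    fetch: Callable[[int, int], Dict[str, object]], page_size: int = 10
) -> Iterator[Dict[str, str]]:
    """Yield catalog records lazily, requesting a new page only when needed."""
    offset = 0
    while True:
        page = fetch(page_size, offset)
        items = page["items"]
        if not items:
            return
        yield from items
        offset += len(items)
        if offset >= page["total"]:
            return
```

Because `iter_catalog` is a generator, a UI that renders only the first screen of results triggers a single page request; further pages are fetched only as the user scrolls or navigates.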

KYC Hold for Customer Orgs¶

Administrators can optionally enable a setting where customer orgs that are created via self service signups are automatically placed in a Know Your Customer (KYC) holding pattern. Until the KYC hold is removed, these users will be blocked from launching and using compute and applications in the new org/tenant.

White Labeling Infra Portal¶

In addition to the end user facing Developer Hub and the PaaS Studio, which already supported white labeling, the Infra portal used by Org Admins can now also be white labeled. See image below for an illustrative example.

Redeploy Compute Instances¶

Previously, updating the resources of an already deployed compute instance (for example, increasing a VM’s vCPUs from 2 to 4) required users to create a new VM and terminate the existing one, resulting in avoidable downtime. With this release, users can modify resources on existing instances without redeployment.

API Retrieval Enhancements¶

The REST API for retrieving Compute and Service instances now supports filtering by instance ID.
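
A minimal sketch of what an ID-filtered request URL might look like is shown below. The host, path, and the `id` query parameter name are assumptions made purely for illustration; consult the API reference for the actual endpoint and parameter names.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical base URL and path, not taken from the Rafay API reference.
BASE_URL = "https://controller.example.com/api/hypothetical/compute-instances"


def build_instance_query(instance_id: str) -> str:
    """Return a request URL that filters the instance list by ID."""
    # "id" as the query parameter name is an assumption for illustration.
    return f"{BASE_URL}?{urlencode({'id': instance_id})}"


url = build_instance_query("inst-1234")
# Round-trip the query string to confirm the filter value is encoded safely.
assert parse_qs(urlsplit(url).query) == {"id": ["inst-1234"]}
```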

SKU Usage Metrics¶

Admins who create and curate SKUs (i.e. compute and service profiles) using the Rafay PaaS Studio can now use the Usage Metrics dashboard to visualize usage by SKU for the last 30 days. Admins can use the Usage Metrics dashboard to:

Understand how resources are being consumed.
Take action on cost optimization opportunities.
Improve operational visibility.
Ensure compliance and auditability across infrastructure.

The dashboard is meant to be a single-pane-of-glass view into how efficiently the platform is running and where adjustments may be needed.

📈 Monitor System Utilization Trends

Total Usage Hours, Compute Usage, and Service Usage provide a quick health check of system activity over the last 30 days. Admins use these metrics to identify underutilization or spikes in demand, enabling proactive scaling or resource optimization.

🖥️ Track High-Usage Instances

The Compute Instances Table, sorted by usage, helps identify which VMs are consuming the most hours. With this, admins can:

Detect potentially idle but costly VMs.
Flag long-running jobs for audit or optimization.
Justify scale-up or scale-down decisions.

🛠️ Manage Lifecycle Events (Created / Deleted Instances)

The Created at and Deleted at timestamps give visibility into provisioning and deprovisioning patterns. With this, admins can:

Reclaim resources from deleted instances.
Cross-check deletion dates with project activity for compliance.
Trace back deleted workloads if needed for diagnostics.
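
To make the arithmetic behind per-SKU usage hours concrete, here is a minimal sketch that derives hours from created and deleted timestamps, clipped to a 30-day window. It is not how the dashboard is implemented; the record layout, function name, and sample data are assumptions for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional, Tuple

# Each record: (sku_name, created_at, deleted_at or None if still running).
Instance = Tuple[str, datetime, Optional[datetime]]


def usage_hours_by_sku(
    instances: List[Instance], now: datetime, window_days: int = 30
) -> List[Tuple[str, float]]:
    """Sum the hours each SKU was in use inside the reporting window."""
    window_start = now - timedelta(days=window_days)
    totals: Dict[str, float] = defaultdict(float)
    for sku, created_at, deleted_at in instances:
        # Clip the instance lifetime to the window boundaries.
        start = max(created_at, window_start)
        end = min(deleted_at or now, now)
        if end > start:
            totals[sku] += (end - start).total_seconds() / 3600
    # Highest usage first, like a table sorted by usage.
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical sample data.
now = datetime(2025, 8, 1, tzinfo=timezone.utc)
instances = [
    # Still running, created 2 days ago: 48 hours.
    ("small-gpu-vm", now - timedelta(days=2), None),
    # Created before the window opened, so only the last 24 hours count.
    ("small-gpu-vm", now - timedelta(days=40), now - timedelta(days=29)),
    # Short-lived instance: 6 hours.
    ("k8s-cluster", now - timedelta(hours=10), now - timedelta(hours=4)),
]
report = usage_hours_by_sku(instances, now)
print(report)  # [('small-gpu-vm', 72.0), ('k8s-cluster', 6.0)]
```

Clipping each lifetime to the window is the key step: an instance created before the window opened contributes only the hours that fall inside the last 30 days.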

🧩 Audit by Profile Types and Projects

Admins can view Profile Types to analyze diversity in workload types (e.g., GPU, K8s) and filter by Project to drill deeper. This helps with:

Budget attribution across teams.
Forecasting demand by profile (e.g., needing more small-gpu-vm profiles).
Enforcing usage policies.

🔍 Quick Troubleshooting and Reporting

Admins can use the search bar to find specific instances by name or ID, and export or snapshot usage data for reporting to finance and compliance teams.

Controller Disaster Recovery Enhancements¶

The air-gapped Rafay Controller has been enhanced to support backup and restore for disaster recovery using MinIO (an S3-compatible object storage solution). This allows backups to be securely stored and restored from MinIO instances hosted on the provider’s local infrastructure, ensuring rapid recovery in the event of system failures or data loss.

Serverless Pods as a Service¶

A new type of compute (Serverless Pods-as-a-Service) is now available for providers to offer their end users. Users can request and use environments via Rafay's self service portal by specifying the OS, container image, GPU, and SSH keys. The end user is then provided with seamless SSH and HTTPS web access to the deployed environments.

Compute¶

End users can configure and launch serverless pods by selecting from configurations available to them. The image below shows an example where the service provider has configured the "compute profile" to allow/disallow certain options for serverless pods. For example, this provider supports Nvidia A40 GPUs.

Once the serverless pod has been successfully deployed, end users can access it via SSH from their laptops. In the image below, the user selected autogeneration of the SSH keypair, so they are given a way to download the private key to their laptop and then access the pod using an SSH command.

Notebooks¶

Users can also deploy and use Jupyter Notebooks hosted on serverless pod based compute. In the example image below, the provider has configured the "paas profile" to allow the end user to select from a few options:

How many GPUs
Type of GPU
Notebook Profile (i.e. comes bundled with specific tooling for AI/ML)

Once the notebook has been successfully deployed on the provider's infrastructure, the end user can access and use it using a web browser. In the image below, the end user is shown the following:

The URL where the Notebook can be accessed
The access token for the notebook
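
As a small sketch of how the two values above fit together: Jupyter servers generally accept the access token as a `token` query parameter on the notebook URL. Whether the notebooks deployed here are configured that way is an assumption, and the URL and token values below are placeholders.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def notebook_login_url(notebook_url: str, token: str) -> str:
    """Append the access token to the notebook URL as a `token` query parameter."""
    parts = urlsplit(notebook_url)
    # Preserve any query parameters already present on the URL.
    query = dict(parse_qsl(parts.query))
    query["token"] = token
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder values; use the URL and token shown in the portal.
login_url = notebook_login_url("https://notebook.example.com/lab", "abc123")
assert login_url == "https://notebook.example.com/lab?token=abc123"
```

Treat a URL that embeds the token like a password: anyone who has it can open the notebook.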

Info

Click here to learn more about this end user facing service.