
Self Service

Self-Service Fractional GPU Memory with Rafay GPU PaaS

In Part-1, we explored how Rafay GPU PaaS empowers developers to use fractional GPUs, allowing multiple workloads to share GPU compute efficiently. This enabled better utilization and cost control — without compromising isolation or performance.

In Part-2, we show how you can build on this by providing users the means to select fractional GPU memory. While fractional GPUs grant a share of the GPU's compute cores, different workloads have dramatically different GPU memory needs. With this update, developers can now choose exactly how much GPU memory they want for their pods, bringing fine-grained control, better scheduling, and cost efficiency.
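To make this concrete, here is a minimal sketch of what such a request can look like at the Kubernetes API level, using the official Python client. The annotation key, scheduler name, namespace, and image are illustrative assumptions; the exact annotation keys depend on the fraction-aware scheduler in use.

```python
# Minimal sketch: request a slice of a shared GPU's memory for a pod via a
# scheduler annotation. The annotation key and scheduler name below are
# illustrative; they vary by fraction-aware scheduler.
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() inside a cluster

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(
        name="train-job",
        annotations={
            # Hypothetical key: ask for 8 GiB of the shared GPU's memory.
            "gpu-memory": "8192",  # MiB requested on the shared GPU
        },
    ),
    spec=client.V1PodSpec(
        scheduler_name="fractional-gpu-scheduler",  # illustrative name
        containers=[
            client.V1Container(
                name="trainer",
                image="nvcr.io/nvidia/pytorch:24.05-py3",
                command=["python", "train.py"],
            )
        ],
        restart_policy="Never",
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="ml-team", body=pod)
```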


Self-Service Fractional GPUs with Rafay GPU PaaS

Enterprises and GPU Cloud providers are rapidly evolving toward a self-service model for developers and data scientists. They want to provide instant access to high-performance compute — especially GPUs — while keeping utilization high and costs under control.

Rafay GPU PaaS enables enterprises and GPU Clouds to achieve exactly that: developers and data scientists can spin up resources such as Developer Pods or Jupyter Notebooks backed by fractional GPUs, directly from an intuitive self-service interface.

This is Part-1 in a multi-part series on end-user, self-service access to fractional GPU-based AI/ML resources.


Self-Service Slurm Clusters on Kubernetes with Rafay GPU PaaS

In the previous blog, we discussed how Project Slinky bridges the gap between Slurm, the de facto job scheduler in HPC, and Kubernetes, the standard for modern container orchestration.

Together, Project Slinky and Rafay's GPU Platform-as-a-Service (PaaS) give enterprises and cloud providers secure, multi-tenant, self-service access to Slurm-based HPC environments on shared Kubernetes clusters. This combination allows cloud providers and enterprise platform teams to offer Slurm-as-a-Service on Kubernetes without compromising performance, usability, or control.
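To give a feel for the end-user experience, here is a minimal sketch of submitting a batch job to such a cluster, assuming kubectl access to a Slinky-provisioned login pod; the namespace and pod name are illustrative assumptions.

```python
# Minimal sketch: submit a batch job to a Slinky-provisioned Slurm cluster
# by exec'ing sbatch in its login pod. The namespace and pod name are
# illustrative and depend on how the cluster was provisioned.
import subprocess

BATCH_SCRIPT = """#!/bin/bash
#SBATCH --job-name=hello
#SBATCH --ntasks=1
srun hostname
"""

result = subprocess.run(
    [
        "kubectl", "exec", "-i",
        "-n", "slurm",        # illustrative namespace
        "slurm-login-0",      # illustrative login pod name
        "--", "sbatch",       # sbatch reads the script from stdin
    ],
    input=BATCH_SCRIPT,
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout.strip())  # e.g. "Submitted batch job 42"
```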


End-User Self-Service for Automated User Profile Creation in SageMaker Domains

As organizations expand their use of Amazon SageMaker to empower data scientists and machine learning (ML) engineers, managing access to development environments becomes a critical concern. In the last blog, we discussed how SageMaker Domains can provide isolated, secure, and fully featured environments for users.

However, manually creating user profiles for every user quickly becomes a bottleneck, especially in large or fast-growing organizations. Asking users to submit an IT ticket and wait days for it to be fulfilled is unacceptable in today's fast-paced environment.

In this blog, we will describe how organizations use Rafay's GPU PaaS to provide their users with a self-service experience to onboard themselves into SageMaker Domains without waiting on IT or platform teams. This not only improves efficiency and user experience but also ensures consistency and compliance across the organization.
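Behind the scenes, this kind of onboarding automation ultimately reduces to a single AWS API call per user. Below is a minimal boto3 sketch; the domain ID, profile name, execution role, and tag are placeholder assumptions.

```python
# Minimal sketch: the AWS call that self-service onboarding automates --
# creating a SageMaker user profile in an existing domain. The domain ID,
# profile name, and execution role below are placeholders.
import boto3

sagemaker = boto3.client("sagemaker")

response = sagemaker.create_user_profile(
    DomainId="d-xxxxxxxxxxxx",         # existing SageMaker domain
    UserProfileName="jane-doe",        # derived from the requesting user
    UserSettings={
        "ExecutionRole": "arn:aws:iam::123456789012:role/SageMakerUserRole",
    },
    Tags=[{"Key": "requested-by", "Value": "self-service"}],
)
print(response["UserProfileArn"])
```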


Developer Self Service Access to DeepSeek on Amazon EKS

A few weeks back, Tiago Reichert from AWS published a very interesting blog on AWS Community showcasing how you can deploy and use the DeepSeek-R1 LLM on an Amazon EKS cluster operating in Auto Mode. Detailed step-by-step instructions are documented in this Git repo.

In this blog, we will describe how we took AWS's excellent blog and packaged it into a turnkey, 1-click self-service experience for enterprise users who are not AWS administrators. It took one of our solution architects 30 minutes to wrap AWS's example code using Rafay's Environment Manager and PaaS.
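Once the environment is provisioned, developers can query the model right away. Below is a minimal sketch, assuming the DeepSeek-R1 deployment is fronted by an OpenAI-compatible API (as vLLM provides); the endpoint URL and model name are illustrative assumptions.

```python
# Minimal sketch: query a DeepSeek-R1 deployment on EKS, assuming it is
# served behind an OpenAI-compatible API (as vLLM exposes). The endpoint
# URL and model name are illustrative.
import requests

ENDPOINT = "http://deepseek.example.internal/v1/chat/completions"

resp = requests.post(
    ENDPOINT,
    json={
        "model": "deepseek-r1",
        "messages": [
            {"role": "user", "content": "Explain fractional GPUs in one sentence."}
        ],
        "max_tokens": 256,
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```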

Over the last few weeks, customers and partners have asked us to demonstrate this daily. Given the significant interest in DeepSeek and in the self-service experience, we believe others will benefit from this blog.
