
Self Service

Self-Service Fractional GPU Memory with Rafay GPU PaaS

In Part-1, we explored how Rafay GPU PaaS empowers developers to use fractional GPUs, allowing multiple workloads to share GPU compute efficiently. This enabled better utilization and cost control — without compromising isolation or performance.

In Part-2, we show how you can build on this by providing users the means to select fractional GPU memory. While fractional GPUs grant a share of the GPU's compute cores, different workloads have dramatically different GPU memory needs. With this update, developers can now choose exactly how much GPU memory they want for their pods, bringing fine-grained control, better scheduling, and cost efficiency.
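To make this concrete, here is a minimal sketch of what such a request can look like at the Kubernetes API level, using the official Python client. The annotation key, scheduler name, namespace, and image are illustrative assumptions; the exact annotation keys depend on the fraction-aware scheduler in use.

```python
# Minimal sketch: request a slice of a shared GPU's memory for a pod via a
# scheduler annotation. The annotation key and scheduler name below are
# illustrative; they vary by fraction-aware scheduler.
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() inside a cluster

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(
        name="train-job",
        annotations={
            # Hypothetical key: ask for 8 GiB of the shared GPU's memory.
            "gpu-memory": "8192",  # MiB requested on the shared GPU
        },
    ),
    spec=client.V1PodSpec(
        scheduler_name="fractional-gpu-scheduler",  # illustrative name
        containers=[
            client.V1Container(
                name="trainer",
                image="nvcr.io/nvidia/pytorch:24.05-py3",
                command=["python", "train.py"],
            )
        ],
        restart_policy="Never",
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="ml-team", body=pod)
```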


Self-Service Fractional GPUs with Rafay GPU PaaS

Enterprises and GPU Cloud providers are rapidly evolving toward a self-service model for developers and data scientists. They want to provide instant access to high-performance compute — especially GPUs — while keeping utilization high and costs under control.

Rafay GPU PaaS enables enterprises and GPU Clouds to achieve exactly that: developers and data scientists can spin up resources such as Developer Pods or Jupyter Notebooks backed by fractional GPUs, directly from an intuitive self-service interface.

This is Part-1 in a multi-part series on end-user, self-service access to fractional GPU-based AI/ML resources.


Self-Service Slurm Clusters on Kubernetes with Rafay GPU PaaS

In the previous blog, we discussed how Project Slinky bridges the gap between Slurm, the de facto job scheduler in HPC, and Kubernetes, the standard for modern container orchestration.

Together, Project Slinky and Rafay's GPU Platform-as-a-Service (PaaS) give enterprises and cloud providers secure, multi-tenant, self-service access to Slurm-based HPC environments on shared Kubernetes clusters. This combination allows cloud providers and enterprise platform teams to offer Slurm-as-a-Service on Kubernetes without compromising performance, usability, or control.
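To give a feel for the end-user experience, here is a minimal sketch of submitting a batch job to such a cluster, assuming kubectl access to a Slinky-provisioned login pod; the namespace and pod name are illustrative assumptions.

```python
# Minimal sketch: submit a batch job to a Slinky-provisioned Slurm cluster
# by exec'ing sbatch in its login pod. The namespace and pod name are
# illustrative and depend on how the cluster was provisioned.
import subprocess

BATCH_SCRIPT = """#!/bin/bash
#SBATCH --job-name=hello
#SBATCH --ntasks=1
srun hostname
"""

result = subprocess.run(
    [
        "kubectl", "exec", "-i",
        "-n", "slurm",        # illustrative namespace
        "slurm-login-0",      # illustrative login pod name
        "--", "sbatch",       # sbatch reads the script from stdin
    ],
    input=BATCH_SCRIPT,
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout.strip())  # e.g. "Submitted batch job 42"
```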


End-User Self-Service for Automated User Profile Creation in SageMaker Domains

As organizations expand their use of Amazon SageMaker to empower data scientists and machine learning (ML) engineers, managing access to development environments becomes a critical concern. In the last blog, we discussed how SageMaker Domains can provide isolated, secure, and fully featured environments for users.

However, manually creating user profiles for every user quickly becomes a bottleneck, especially in large or fast-growing organizations. Asking users to submit an IT ticket and wait days for it to be fulfilled is unacceptable in today's fast-paced environment.

In this blog, we will describe how organizations use Rafay's GPU PaaS to provide their users with a self-service experience to onboard themselves into SageMaker Domains without waiting on IT or platform teams. This not only improves efficiency and user experience but also ensures consistency and compliance across the organization.
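Behind the scenes, this kind of onboarding automation ultimately reduces to a single AWS API call per user. Below is a minimal boto3 sketch; the domain ID, profile name, execution role, and tag are placeholder assumptions.

```python
# Minimal sketch: the AWS call that self-service onboarding automates --
# creating a SageMaker user profile in an existing domain. The domain ID,
# profile name, and execution role below are placeholders.
import boto3

sagemaker = boto3.client("sagemaker")

response = sagemaker.create_user_profile(
    DomainId="d-xxxxxxxxxxxx",         # existing SageMaker domain
    UserProfileName="jane-doe",        # derived from the requesting user
    UserSettings={
        "ExecutionRole": "arn:aws:iam::123456789012:role/SageMakerUserRole",
    },
    Tags=[{"Key": "requested-by", "Value": "self-service"}],
)
print(response["UserProfileArn"])
```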


Developer Self Service Access to DeepSeek on Amazon EKS

A few weeks back, Tiago Reichert from AWS published a very interesting blog on AWS Community showcasing how you can deploy and use the DeepSeek-R1 LLM on an Amazon EKS cluster operating in Auto Mode. Detailed step-by-step instructions are documented in this Git repo.

In this blog, we will describe how we took AWS's excellent blog and packaged it into a turnkey, 1-click self-service experience for enterprise users who are not AWS administrators. It took one of our solution architects 30 minutes to wrap AWS's example code using Rafay's Environment Manager and PaaS.
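Once the environment is provisioned, developers can query the model right away. Below is a minimal sketch, assuming the DeepSeek-R1 deployment is fronted by an OpenAI-compatible API (as vLLM provides); the endpoint URL and model name are illustrative assumptions.

```python
# Minimal sketch: query a DeepSeek-R1 deployment on EKS, assuming it is
# served behind an OpenAI-compatible API (as vLLM exposes). The endpoint
# URL and model name are illustrative.
import requests

ENDPOINT = "http://deepseek.example.internal/v1/chat/completions"

resp = requests.post(
    ENDPOINT,
    json={
        "model": "deepseek-r1",
        "messages": [
            {"role": "user", "content": "Explain fractional GPUs in one sentence."}
        ],
        "max_tokens": 256,
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```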

Over the last few weeks, customers and partners have asked us to demonstrate this daily. Given the significant interest in DeepSeek and in the self-service experience, we believe others will benefit from this blog.
