
Fractional GPU

Self-Service Fractional GPUs with Rafay GPU PaaS

Enterprises and GPU Cloud providers are rapidly evolving toward a self-service model for developers and data scientists. They want to provide instant access to high-performance compute, especially GPUs, while keeping utilization high and costs under control.

Rafay GPU PaaS enables enterprises and GPU Clouds to achieve exactly that: developers and data scientists can spin up resources such as Developer Pods or Jupyter Notebooks backed by fractional GPUs, directly from an intuitive self-service interface.

This is Part 1 of a multi-part series on end-user, self-service access to fractional GPU-based AI/ML resources.

Fractional GPU

Fractional GPUs using Nvidia's KAI Scheduler

At KubeCon Europe in April 2025, Nvidia announced and launched the Kubernetes AI (KAI) Scheduler, an open-source project maintained by Nvidia.

The KAI Scheduler is an advanced Kubernetes scheduler that allows administrators of Kubernetes clusters to dynamically allocate GPU resources to workloads. Users of the Rafay Platform can immediately leverage the KAI scheduler via the integrated Catalog.
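To illustrate how a workload requests a GPU fraction, here is a minimal pod spec sketch. It assumes the cluster runs the KAI Scheduler with GPU sharing enabled; the annotation and queue label keys shown here follow the KAI Scheduler project's documented conventions, but verify the exact keys for the version deployed in your cluster.

```yaml
# Illustrative sketch only: label and annotation keys are assumptions
# based on the KAI Scheduler docs; confirm them for your deployment.
apiVersion: v1
kind: Pod
metadata:
  name: notebook-fractional-gpu
  labels:
    runai/queue: team-a          # KAI schedules pods against a named queue
  annotations:
    gpu-fraction: "0.5"          # request half of a single physical GPU
spec:
  schedulerName: kai-scheduler   # hand this pod to KAI instead of the default scheduler
  containers:
    - name: jupyter
      image: jupyter/base-notebook
```

With a spec like this, two pods each requesting `gpu-fraction: "0.5"` can share one physical GPU, which is what drives the utilization gains described above.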

KAI in Catalog

To help you understand the basics quickly, we have also created a brief video that introduces the concepts and includes a live demonstration of allocating fractional GPU resources to workloads.