DRA

October 23, 2025
in Product Blog, Kubernetes v1.34, GPU Resource Management, DRA, AI/ML
6 min read

GPU Resource Management in Kubernetes: From Extended Resource to DRA

This blog is part of our DRA series, continuing from our earlier posts: Introduction to DRA, Enabling DRA with Kind, and MIG with DRA. This post compares pre-DRA and post-DRA GPU management on Rafay upstream Kubernetes clusters.

October 6, 2025
in DRA, Kubernetes, Get Started, MKS
8 min read

Dynamic Resource Allocation for GPU Allocation on Rafay's MKS (Kubernetes 1.34)

This blog demonstrates how to use Dynamic Resource Allocation (DRA) for efficient GPU allocation with the Multi-Instance GPU (MIG) strategy on Rafay's Managed Kubernetes Service (MKS) running Kubernetes 1.34.

In our previous blog series, we covered various aspects of Dynamic Resource Allocation (DRA) in Kubernetes:

Introduction to Dynamic Resource Allocation (DRA) in Kubernetes — what it is and why it matters
Enable Dynamic Resource Allocation (DRA) in Kubernetes — configuring DRA on a Kubernetes 1.34 cluster using kind
Deploy Workload using DRA ResourceClaim/ResourceClaimTemplate in Kubernetes — deploying and managing DRA workloads natively on Kubernetes

DRA is GA in Kubernetes 1.34

With Kubernetes 1.34, Dynamic Resource Allocation (DRA) is Generally Available (GA) and enabled by default on MKS clusters. This means you can immediately start using DRA features without additional configuration.

Prerequisites

Before we begin, ensure you have:

A Rafay MKS cluster running Kubernetes 1.34 (see MKS v1.34 Blog)
GPU nodes with compatible NVIDIA GPUs (A100, H100, or similar MIG-capable GPUs)
Container Device Interface (CDI) enabled (automatically enabled in MKS for Kubernetes 1.34)
Basic understanding of Dynamic Resource Allocation concepts (covered in our previous blog series)
Active Rafay account with appropriate permissions to manage MKS clusters and addons

September 16, 2025
in DRA, Kubernetes, Get Started, Deploy Workloads
5 min read

Deploy Workload using DRA ResourceClaim in Kubernetes

In the first blog in the DRA series, we introduced the concept of Dynamic Resource Allocation (DRA), which recently went GA in Kubernetes v1.34, released at the end of August 2025.

In the second blog, we installed a Kubernetes v1.34 cluster and deployed an example DRA driver on it with "simulated GPUs". In this blog, we’ll deploy a few workloads on the DRA-enabled Kubernetes cluster to understand how "ResourceClaim" and "ResourceClaimTemplate" work.

Info

We have optimized the steps for users to experience this on their laptops in less than 5 minutes. The steps in this blog are optimized for macOS users.
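
As a rough illustration of the objects this post refers to, the sketch below builds a ResourceClaimTemplate and a Pod that consumes it, then prints them as a JSON list. It assumes the `resource.k8s.io/v1` API that is GA in Kubernetes v1.34. The object names, the container image, and the `gpu.example.com` DeviceClass are hypothetical placeholders, so substitute whatever your DRA driver actually publishes.

```python
import json

# Assumption: the DeviceClass name depends on the DRA driver installed on the
# cluster; "gpu.example.com" is a placeholder used here for illustration only.
DEVICE_CLASS = "gpu.example.com"


def resource_claim_template(name, device_class):
    """Template from which Kubernetes creates one ResourceClaim per Pod."""
    return {
        "apiVersion": "resource.k8s.io/v1",
        "kind": "ResourceClaimTemplate",
        "metadata": {"name": name},
        "spec": {
            "spec": {
                "devices": {
                    "requests": [
                        # Request exactly one device from the given DeviceClass.
                        {"name": "gpu", "exactly": {"deviceClassName": device_class}}
                    ]
                }
            }
        },
    }


def pod_using_template(name, template_name, image):
    """Pod that declares a claim from the template and hands it to its container."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                {
                    "name": "app",
                    "image": image,
                    "command": ["sleep", "infinity"],
                    # The container refers to the Pod-level claim by name.
                    "resources": {"claims": [{"name": "gpu"}]},
                }
            ],
            "resourceClaims": [
                {"name": "gpu", "resourceClaimTemplateName": template_name}
            ],
        },
    }


# Hypothetical names and image, chosen only for this sketch.
template = resource_claim_template("single-gpu", DEVICE_CLASS)
pod = pod_using_template("dra-demo", "single-gpu", "ubuntu:22.04")

manifest = {"apiVersion": "v1", "kind": "List", "items": [template, pod]}
print(json.dumps(manifest, indent=2))
```

Because kubectl accepts JSON as well as YAML, the output of a script like this can be piped to `kubectl apply -f -`. A standalone ResourceClaim would look similar, except that the Pod references it with `resourceClaimName` and every Pod pointing at it shares the same claim.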

August 28, 2025
in DRA, Kubernetes, Get Started
6 min read

Enable Dynamic Resource Allocation (DRA) in Kubernetes

In the previous blog, we introduced the concept of Dynamic Resource Allocation (DRA), which just went GA in Kubernetes v1.34, released in August 2025.

In this blog post, we’ll configure DRA on a Kubernetes 1.34 cluster.

Info

We have optimized the steps for users to experience this on their macOS or Windows laptops in less than 15 minutes. The steps in this blog are optimized for macOS users.

August 23, 2025
in DRA, Kubernetes
4 min read

Introduction to Dynamic Resource Allocation (DRA) in Kubernetes

In the previous blog, we reviewed the limitations of Kubernetes GPU scheduling. These often result in:

Resource fragmentation – large portions of GPU memory remain idle and unusable.
Topology blindness – multi-GPU workloads may be scheduled suboptimally.
Cost explosion – teams overprovision GPUs to work around scheduling inefficiencies.
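
To make the fragmentation point concrete, here is a small back-of-the-envelope sketch. It assumes the whole-GPU allocation model, where each workload is handed an entire device regardless of how much memory it uses. The GPU capacity and the per-workload figures are hypothetical numbers chosen for illustration, not measurements.

```python
from dataclasses import dataclass

# Hypothetical capacity of a single GPU, in GiB.
GPU_MEM_GIB = 80


@dataclass
class Workload:
    name: str
    gpu_mem_gib: int  # memory the workload actually uses (hypothetical)


def idle_memory_whole_gpu(workloads, gpu_mem_gib=GPU_MEM_GIB):
    """Total GiB left unused when every workload receives one whole GPU."""
    return sum(gpu_mem_gib - w.gpu_mem_gib for w in workloads)


workloads = [
    Workload("inference-a", 10),
    Workload("inference-b", 16),
    Workload("notebook", 8),
]

allocated = len(workloads) * GPU_MEM_GIB
idle = idle_memory_whole_gpu(workloads)
print(f"{idle} of {allocated} GiB idle ({idle / allocated:.0%})")
# prints "206 of 240 GiB idle (86%)"
```

With these made-up figures, three small workloads tie up three GPUs while using only 34 GiB between them, which is the kind of gap that finer-grained allocation is meant to close.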

In this post, we’ll look at how a new GA feature in Kubernetes v1.34 — Dynamic Resource Allocation (DRA) — aims to solve these problems and transform GPU scheduling in Kubernetes.

August 20, 2025
in DRA, Kubernetes
4 min read

Rethinking GPU Allocation in Kubernetes

Kubernetes has cemented its position as the de facto standard for orchestrating containerized workloads in the enterprise. In recent years, its role has expanded beyond web services and batch processing into one of the most demanding domains of all: AI/ML workloads.

Organizations now run everything from lightweight inference services to massive, distributed training pipelines on Kubernetes clusters, relying heavily on GPU-accelerated infrastructure to fuel innovation.

But there’s a problem. In this blog, we will explore why the current model falls short, what a more advanced GPU allocation approach looks like, and how it can unlock efficiency, performance, and cost savings at scale.