
v1.1.47 - Terraform Provider

20 May, 2025

This update of the Terraform provider includes the following improvements/bug fixes.

Bug Fixes

Bug ID Description
RC-36171 Improved error messaging for the rafay_download_kubeconfig Terraform resource
RC-41733 Resolved an issue where terraform apply would display a diff in spec.variables for static resources, even when there were no actual changes
RC-41568 Fixed an issue where re-applying Terraform for resource templates with destroy OpenTofu hooks resulted in an error
RC-41128 Resolved issues with Terraform flatteners where changes to the working directory path and service account name in workflow handler specs were not detected during terraform apply --refresh-only after manual updates via the UI
RC-41035 Added validation to prevent the creation of environment templates with invalid schedule task types or agent override types
RC-40878 Enhanced clarity of error messages related to hooks in resource and environment templates
RC-40465 Resolved issue where Terraform re-apply showed diffs in certain sensitive values for the Workflow Handler
RC-40556 Fixed incorrect diffs appearing during terraform plan for Cloud Credentials v3 resources
RC-35451 Resolved issue preventing simultaneous blueprint updates and label additions to imported clusters in a single Terraform apply
RC-40669 AKS: Fixed an issue where Terraform failed with the error 'Day 2 operation SystemPlacementUpdation is not allowed because Cluster is not in running state' when attempting to start the cluster.

v3.4 Update 1 - SaaS

20 May, 2025

The following issues related to Cluster Sharing have been addressed in this release.

Bug ID Description
RC-42007 EKS: Addressed Terraform apply failure when creating a cluster with sharing enabled
RC-42073 EKS: Addressed an issue where the controller backend was not correctly updated when the sharing block was removed from the cluster spec
RC-40459 AKS: Resolved an issue with honoring the configuration in 'rafay_cluster_sharing_single' TF resource
RC-41818 GKE: Resolved an issue with honoring the configuration in 'rafay_cluster_sharing_single' TF resource

Documentation Update

  • A new document, Cluster Sharing Best Practices, is now available to help users correctly configure and manage cluster sharing across projects using the Terraform provider.

  • It outlines supported patterns and best practices, and explains how to avoid configuration conflicts when using rafay_cluster_sharing, rafay_cluster_sharing_single, or embedded sharing blocks.


v3.4 - SaaS

04 May, 2025

Upstream Kubernetes for Bare Metal and VMs

Enhanced Bulk Add/Delete Response in RCTL

Enhancements have been made to the RCTL experience when adding or deleting nodes in Upstream Kubernetes clusters.

Previously, when performing bulk node operations (add/delete), the response from RCTL did not clearly indicate which nodes were impacted. With this enhancement, the response message now includes the names of the affected nodes, providing users with clearer visibility and traceability.

This helps users confirm exactly which nodes were successfully added or removed from their upstream cluster.

Sample Response

Below is an example of the enhanced response returned when performing a bulk operation using RCTL:

{
  "taskset_id": "9dk3emn",
  "operations": [
    {
      "operation": "NodeAddition",
      "resource_name": "test-21",
      "status": "PROVISION_TASK_STATUS_PENDING"
    },
    {
      "operation": "BulkNodeUpdate",
      "resource_name": "test-21",
      "status": "PROVISION_TASK_STATUS_PENDING"
    },
    {
      "operation": "BulkNodeDelete",
      "resource_name": "test-21",
      "status": "PROVISION_TASK_STATUS_PENDING"
    }
  ],
  "comments": "Node add operation will be performed on: test-44, test-45. Node delete operation will be performed on: test-43. Node update operation will be performed on: test-41. The status of the operations can be fetched using taskset_id",
  "status": "PROVISION_TASKSET_STATUS_PENDING"
}
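
For scripting, this response can be parsed with standard tools. The snippet below is a minimal sketch, assuming the response has been saved to a file named response.json (an assumed name); it lists each operation with the affected node and its status, and captures the taskset_id for tracking progress later.

# List each operation, the affected node, and its current status (assumes the response was saved to response.json)
jq -r '.operations[] | "\(.operation)  \(.resource_name)  \(.status)"' response.json

# Capture the taskset_id so the progress of the operations can be tracked later
TASKSET_ID=$(jq -r '.taskset_id' response.json)
echo "Track progress using taskset_id: ${TASKSET_ID}"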

Preflight Checks for MKS Clusters via RCTL

As part of provisioning clusters using RCTL with a cluster configuration file, users can now initiate MKS-specific Conjurer preflight checks directly using a new command-line flag.

Use the following command to invoke conjurer preflights during cluster provisioning:

./rctl apply -f <cluster.yaml> --mks-prechecks

Sample Response

Running Preflight-Check command on node: mks-node-5 (x.x.x.x)
Ubuntu detected. Checking and installing bzip2 if necessary...
[+] Performing pre-tests
    [+] Operating System check
    [+] CPU check
    [+] Memory check
    [+] Internet connectivity check
    [+] Connectivity check to rafay registry
    [+] DNS Lookup to the controller
    [+] Connectivity check to the Controller
    [+] Multiple default routes check
    [+] Time Sync check
    [+] Storage check
         Detected device: /dev/loop0, mountpoint: /snap/core18/2829, type: loop, size: 55.7M, fstype: null
         Detected device: /dev/loop1, mountpoint: /snap/oracle-cloud-agent/72, type: loop, size: 77.3M, fstype: null
         Detected device: /dev/loop2, mountpoint: /snap/snapd/21759, type: loop, size: 38.8M, fstype: null
         Detected device: /dev/sda, mountpoint: null, type: disk, size: 46.6G, fstype: null
         Detected device: /dev/sda1, mountpoint: /, type: part, size: 45.6G, fstype: ext4
         Detected device: /dev/sda14, mountpoint: null, type: part, size: 4M, fstype: null
         Detected device: /dev/sda15, mountpoint: /boot/efi, type: part, size: 106M, fstype: vfat
         Detected device: /dev/sda16, mountpoint: /boot, type: part, size: 913M, fstype: ext4
         Detected device: /dev/sdb, mountpoint: null, type: disk, size: 150G, fstype: null
         Potential storage device: /dev/sdb
    [+] Hostname underscore check
    [+] DNS port check
    [+] Nameserver Rotate option check for /etc/resolv.conf

[+] Checking for Warnings

[+] Checking for Fatal errors

[+] Checking for hard failures


-------------------------------------
Preflight-Checks ran successfully on 1 node
mks-node-5 (129.146.83.94)

Note

To leverage this flag, download the latest RCTL binary; the --mks-prechecks flag is supported only in the latest version.
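
For automation, the preflight run can be gated on its output. The sketch below is a minimal example, not an official interface: the log file name and the success string it checks for are taken from the sample output above and may change between RCTL versions.

# Run the apply with MKS preflight checks and keep a copy of the output
./rctl apply -f cluster.yaml --mks-prechecks | tee preflight.log

# Stop the surrounding automation if the success line is missing
grep -q "Preflight-Checks ran successfully" preflight.log || {
  echo "Preflight checks did not pass; inspect preflight.log" >&2
  exit 1
}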

For more information about this feature, click here.

Enhancement: Improved DNS Configuration Resilience

We have enhanced the handling of DNS configuration to provide better stability and resilience.

What Changed

  1. Dynamic DNS Reconfiguration Handling
    In some environments, Network Manager may update node DNS settings, causing Consul to fail service discovery and destabilize the cluster. The system now detects these changes automatically, fetches the correct host IP using the updated IP configuration, and validates DNS settings against /etc/resolv.conf and the DNS discovery config to maintain consistent behavior.

  2. Handling of the rotate Flag in resolv.conf
    When the rotate option is enabled in resolv.conf, DNS requests are round-robined across available servers, which can lead to intermittent discovery failures for Consul. Users are now warned when the rotate flag is detected (via the conjurer binary), and during day-2 operations the system automatically removes the rotate flag to ensure consistent DNS resolution and avoid cluster issues.

These enhancements help avoid DNS-related instability and ensure reliable and consistent service discovery in upstream Kubernetes clusters, even in dynamic or custom network environments.
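
As a quick manual check (the conjurer binary performs the equivalent detection automatically), the rotate option can be spotted on a node with a single grep against /etc/resolv.conf:

# Print a warning if the 'rotate' option is present in /etc/resolv.conf
grep -E '^options.*rotate' /etc/resolv.conf && echo "WARNING: 'rotate' option detected; DNS queries will be round-robined across nameservers"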

Note

This enhancement is applicable only for newly created upstream MKS clusters. Support for existing clusters will be included in the next release.


Backup & Restore

Agent update

This enhancement enables users to upgrade an existing Data Agent from version v1.9.3 to v1.15.1.
The upgrade can be performed using either the UI or RCTL.

As part of the upgrade to v1.15.1, users can optionally enable new capabilities introduced in this version, including:

  • Enable CSI: Enables support for volume snapshots using the Container Storage Interface (CSI).
  • SSE-C Encryption Key: Allows users to configure server-side encryption with customer-provided keys for enhanced data security.

These features are available only in version v1.15.1 and are not supported in earlier versions.


For more information about this feature, click here.


Amazon EKS

Day 2 Tag Updates for Self-Managed Node Groups

Previously, updating tags on self-managed node groups during Day 2 operations was not supported. Attempting to modify or add tags would result in a validation error, preventing the update from being applied.

With this release, users can now update tags on self-managed node groups as part of Day 2 operations, offering improved flexibility and lifecycle management for EKS clusters.

Note

As part of this Day 2 tag update, the nodes will be recycled: a new launch template with the updated tags is created and applied internally to all resources, resulting in the replacement of existing nodes without requiring manual intervention.


Platform as a Service (PaaS)

Info

The PaaS feature is behind a feature flag. The enhancements described below are available to customer orgs only if the feature flag is enabled.

Partner-Level Dashboards

Partner-level dashboards will be available, offering insights into various usage metrics with the ability to filter data by organization. These dashboards can help answer questions such as:

  • How many SKUs or profiles have been created?
  • How many instances have been launched, by whom, and under which organizations?
  • Which profiles are most frequently used for instance creation?
  • Are there any instances currently in a failed or unhealthy state?
  • When were instances created, and how long have they been running?
  • What are the usage trends over time for instance creation?
  • Who are the most active users across organizations?


For more information about this feature, click here.

Metrics for VM instances

The platform currently provides utilization metrics for instances based on Kubernetes clusters. Similar metrics, such as GPU and memory utilization, are being extended to support VM-based instances as well. Users can filter by time range and view historical metrics going back up to one week.


User Management

Authentication

Several enhancements are being implemented to strengthen authentication, including stricter policies that prevent local users from reusing previously used passwords.


Namespaces

Label Associations

A previous release introduced support for configuring namespace labels at the project level. Upcoming enhancements focus on performance optimization and improvements to the reconciliation loop for more efficient and reliable label management.


Environment Manager

Skip Resource Execution

This enhancement enables selective execution of specific resources during an environment deployment. It is particularly useful in cases where only certain resources, such as DNS updates for failover, require changes, allowing all other unchanged resources to be skipped during execution.

Note

This feature is initially supported through non-UI interfaces. UI support will be added in a subsequent release.

For more information about this feature, click here.


Helm App Catalog

The Helm App Catalog has been updated to add support for the following repositories.

Category Description
AI/ML AMD GPU Operator

For more information about this feature, click here.


Bug Fixes

Bug ID Description
RC-41391 Fixed an issue where applying a blueprint update failed with a CNI NotFound error.
RC-41275 Resolved a failure in EKS cluster creation when using the Ubuntu2204 AMI family.
RC-40459 Upstream Kubernetes: Fixed an issue where the node page navigation would hang and freeze.