Certificate Rotation
Overview
Managing certificates within MKS Clusters is critical for maintaining operational continuity and ensuring security. Each cluster relies on a set of certificates with distinct expiration dates, necessitating proactive management to avoid service disruptions or security vulnerabilities. Rafay's certificate rotation feature addresses these challenges by automating the periodic renewal of credentials before they expire. This process involves scheduling maintenance windows to safely update client and server certificates used for Kubernetes operations, including authentication between components such as kubelets, API servers, and administrators. By streamlining certificate management, the product enhances cluster reliability and security, ensuring seamless operation without compromising uptime or performance.
In addition to automated rotations, users have the option to perform manual certificate rotations, especially if a cluster remains un-upgraded for extended periods, such as over a year. During these manual rotations, nodes are recreated to implement the new credentials, which can disrupt ongoing workloads. Therefore, it is crucial to schedule these rotations during maintenance windows to avoid unexpected downtime or unresponsive API clients. If users do not manually rotate the certificates within the scheduled timeframe, the system will automatically perform the rotation to maintain cluster security and operability, ensuring minimal disruption and continued protection.
Security Through Timely Certificate Rotation
Certificate rotation is essential both before certificates expire and in the event of a certificate compromise. Usually, certificate rotation happens as part of a cluster upgrade. However, some users might opt out of upgrading their clusters due to application incompatibility or other business considerations. This can eventually lead to certificate expiration, impacting cluster functionality. In such cases, automatic certificate rotation helps maintain security without requiring user intervention. Users who need to rotate certificates before the expiry date because of security or vulnerability concerns can use the Manual Rotation option. This allows them to proactively address certificate-related issues and mitigate risks associated with compromised or expiring certificates.
Warning
- All certificate rotation types may leave the cluster temporarily unavailable while components are restarted. For production environments, perform rotations during a maintenance window. Additionally, after rotating the Kubernetes-CA certificate type, restart any workload or operator pods that interact with the Kubernetes API server.
- Certificate rotation is not supported for Windows worker nodes.
Certificates Required to Create a Kubernetes Cluster
A Kubernetes cluster uses the certificates listed below to secure communication between its components.
Kubernetes CAs
- Apiserver.crt
- Apiserver.key
- Apiserver-kubelet-client.crt
- Apiserver-kubelet-client.key
ETCD CAs
- Healthcheck-client.crt
- Healthcheck-client.key
- Apiserver-etcd-client.crt
- Apiserver-etcd-client.key
Front Proxy CAs
- Front-proxy-client.crt
- Front-proxy-client.key
Important
CA certificates have a 10-year expiry, while leaf certificates expire after one year.
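To confirm these lifetimes on a node, one option is to read the expiry date that `openssl x509 -noout -enddate` prints for a certificate file. The sketch below is illustrative only and is not part of the product: the helper names are hypothetical, and it assumes the `openssl` CLI is installed and prints English month names.

```python
import subprocess
from datetime import datetime, timezone


def parse_not_after(line: str) -> datetime:
    """Parse a line such as 'notAfter=Jun  1 12:00:00 2035 GMT' into a UTC datetime."""
    value = line.strip().split("=", 1)[1]
    parsed = datetime.strptime(value, "%b %d %H:%M:%S %Y %Z")
    return parsed.replace(tzinfo=timezone.utc)


def cert_not_after(path: str) -> datetime:
    """Ask the openssl CLI for the expiry date of the certificate at `path`."""
    result = subprocess.run(
        ["openssl", "x509", "-noout", "-enddate", "-in", path],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_not_after(result.stdout)


def days_remaining(not_after: datetime, now: datetime) -> int:
    """Whole days left before expiry (negative once the certificate has expired)."""
    return (not_after - now).days
```

For example, `days_remaining(cert_not_after(path), datetime.now(timezone.utc))` reports the remaining validity for a certificate at `path`. The location of the certificate files on an MKS node is not stated here, so `path` is left as a placeholder.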
Pre-requisites for Cert Rotation
- For manual rotation of any cluster certificate type, ensure the cluster is healthy.
- For automatic rotation, the backend will check if a few pods are up before initiating certificate rotation.
Automatic Certificate Rotation
The automatic certificate rotation feature renews certificates within MKS Clusters without user intervention. Approximately 60 days before a certificate's expiration date, a system-generated email notifies users of the impending expiration and the scheduled rotation date. This email is sent only once. It states that the system will automatically initiate certificate rotation 30 days before expiry, which gives users time to plan for the rotation, or to rotate manually beforehand, and avoid unexpected disruptions. Automating this process simplifies certificate management, enhances security, and maintains operational continuity across MKS Clusters.
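The timeline can be sketched as simple date arithmetic. This is a minimal illustration, not the backend's logic: the names are hypothetical, the 60-day figure is described as approximate, and the 30-day offset assumes that rotation starts 30 days before expiry.

```python
from dataclasses import dataclass
from datetime import date, timedelta

NOTIFY_DAYS_BEFORE_EXPIRY = 60  # one-time notification email (approximate)
ROTATE_DAYS_BEFORE_EXPIRY = 30  # assumed start of the automatic rotation


@dataclass(frozen=True)
class RotationTimeline:
    notification: date
    auto_rotation: date
    expiry: date


def rotation_timeline(expiry: date) -> RotationTimeline:
    """Derive the notification and automatic rotation dates from the expiry date."""
    return RotationTimeline(
        notification=expiry - timedelta(days=NOTIFY_DAYS_BEFORE_EXPIRY),
        auto_rotation=expiry - timedelta(days=ROTATE_DAYS_BEFORE_EXPIRY),
        expiry=expiry,
    )


timeline = rotation_timeline(date(2025, 12, 31))
print(timeline.notification, timeline.auto_rotation, timeline.expiry)
```

The window between `notification` and `auto_rotation` is the period in which a manual rotation can be scheduled for a convenient maintenance window.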
Manual Certificate Rotation
Users have the option to manually perform certificate rotation via the Console interface, offering flexibility and control over certificate management within MKS Clusters. This process enables users to address specific timing needs or urgent security requirements. Follow the steps below to initiate the certificate rotation process manually via the console.
- Select the required upstream cluster from the cluster dashboard and click the settings icon.
- Click View/Rotate Certificates.
The View/Rotate Certificate screen appears.
View Certificate
Here, users can view the list of CA certificates and Control Plane Certificates (Leaf Certificates) associated with the selected upstream cluster, including their certificate creation date and expiry date.
Note: If nodes are not reachable, certificates cannot be fetched.
Certificate Rotation Types
Clicking Rotate shows four types of certificate rotation:
- Kubernetes-Leaf
- ETCD-Leaf
- ETCD-CA
- Kubernetes-CA
Select the required cert rotation type and click Rotate Certificate.
Rotating the Kubernetes CA certificate will automatically rotate the Kubernetes leaf certificate. Similarly, rotating the ETCD CA certificate will automatically rotate the ETCD leaf certificates.
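The cascade rule above can be modeled as a small lookup table. This is only a sketch of the documented behavior; the `CASCADES` mapping and function name are hypothetical and do not represent a product API.

```python
# Which leaf certificates are rotated implicitly by each rotation type.
CASCADES = {
    "Kubernetes-CA": ["Kubernetes-Leaf"],
    "ETCD-CA": ["ETCD-Leaf"],
    "Kubernetes-Leaf": [],
    "ETCD-Leaf": [],
}


def certificates_rotated(rotation_type: str) -> list[str]:
    """Return the selected rotation type plus anything it rotates implicitly."""
    if rotation_type not in CASCADES:
        raise ValueError(f"unknown rotation type: {rotation_type}")
    return [rotation_type, *CASCADES[rotation_type]]


for rotation_type in CASCADES:
    print(rotation_type, "->", certificates_rotated(rotation_type))
```

In practice this means a CA rotation never needs to be followed by a separate leaf rotation of the same family.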
Cert Rotation Activity
Kubernetes-CA Cert Rotation is initiated on the control plane nodes and then on the worker nodes. Other Cert Rotations will only be applied on the control plane nodes. All these activities are tracked in the activity tab. Once the cert rotation is initiated, click on Activity to view the rotation logs as shown below.
Cert Rotation Status
- Initiated: Displays the time when certificate rotation was initiated, along with the Node ID where the rotation is performed.
- Success: Displays the time when certificate rotation was completed, along with the Node ID where the rotation is completed.
- Failed: Displays the time of certificate rotation failure, the Node ID where the rotation was attempted, and the reason for the failure.
Once the Kubernetes-CA certificate rotation is completed, Rafay pods automatically restart in the backend. Users should allow sufficient time for the cluster to become healthy.
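The node scope described in this section (control plane nodes first, then worker nodes for Kubernetes-CA, and control plane nodes only for the other types) can be sketched as follows. The function and node names are hypothetical and serve only to illustrate the ordering.

```python
def rotation_node_order(
    rotation_type: str,
    control_plane_nodes: list[str],
    worker_nodes: list[str],
) -> list[str]:
    """Return the nodes a rotation touches, in the order they are processed."""
    if rotation_type == "Kubernetes-CA":
        # Control plane nodes are rotated first, then the worker nodes.
        return list(control_plane_nodes) + list(worker_nodes)
    # All other rotation types are applied on control plane nodes only.
    return list(control_plane_nodes)


print(rotation_node_order("Kubernetes-CA", ["cp-1", "cp-2"], ["worker-1"]))
print(rotation_node_order("ETCD-Leaf", ["cp-1", "cp-2"], ["worker-1"]))
```

Each node in the returned list would correspond to its own Initiated and Success (or Failed) entries in the Activity tab.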
Alerts
The system monitors the expiry dates of all available certificates. When a certificate is within 60 days of expiration, the system triggers an alert and initiates the renewal process as necessary. Users can view the entire history of certificate rotations in the System Audit Log.
Post Cert Rotation Actions
After a Kubernetes-CA certificate rotation, users need to restart any workloads or webhooks that communicate with the Kubernetes API server.
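One common way to restart a Deployment is a rolling restart, which `kubectl rollout restart` triggers by stamping the `kubectl.kubernetes.io/restartedAt` annotation on the pod template. The sketch below builds an equivalent patch body. The function name is hypothetical, and this is one possible approach rather than a procedure prescribed by the product.

```python
from datetime import datetime, timezone

RESTART_ANNOTATION = "kubectl.kubernetes.io/restartedAt"


def rollout_restart_patch(now: datetime) -> dict:
    """Build a patch that changes the pod template, which triggers a rolling restart.

    `now` must be a timezone-aware datetime; it is normalized to UTC.
    """
    timestamp = now.astimezone(timezone.utc).isoformat()
    return {
        "spec": {
            "template": {
                "metadata": {"annotations": {RESTART_ANNOTATION: timestamp}}
            }
        }
    }


print(rollout_restart_patch(datetime.now(timezone.utc)))
```

With the official Kubernetes Python client, a body like this could be passed to `AppsV1Api().patch_namespaced_deployment(name, namespace, body)`, where `name` and `namespace` are placeholders for the affected workload.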
Watch this video for a demonstration on Certificate Rotation