

Monitor Ceph Cluster

In this part, you will set up and access the Ceph Dashboard to monitor and manage the Ceph cluster.


Step 1: Install Prometheus Operator

For the Ceph Dashboard to be used to its full potential and display cluster metrics, Prometheus is needed to collect those metrics and make them available to the dashboard.

In this step, you will install the Prometheus Operator into the Kubernetes cluster as a custom cluster add-on. The add-on will then be added to the existing custom cluster blueprint.

  • Navigate to Infrastructure -> Add-Ons
  • Click New Add-On -> Create New Add-On
  • Enter the name prometheus-operator
  • Select K8s YAML for the type
  • Select Upload files manually
  • Select the rook-ceph namespace
  • Click Create

Create Add-On

  • Click New Version
  • Enter v1 for the version name
  • Save the YAML file located HERE to your local machine
  • Click Upload and select the previously saved YAML file
  • Click Save Changes
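
The Prometheus Operator itself is not deployed until the blueprint is applied in a later step. Once that has happened, a quick check similar to the following can confirm the operator and its CRDs are present. The exact deployment name depends on the manifest you uploaded, so treat the grep pattern as an assumption:

# Run after the blueprint is applied
# Confirm the Prometheus Operator deployment exists (name assumed from the add-on/manifest)
kubectl -n rook-ceph get deployments | grep -i prometheus-operator
# Confirm the Prometheus Operator CRDs were registered
kubectl get crd | grep monitoring.coreos.com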

Step 2: Install Prometheus Server

In this step, you will install a Prometheus Server into the Kubernetes cluster as a custom cluster add-on. The add-on will then be added to the existing custom cluster blueprint.

  • Navigate to Infrastructure -> Add-Ons
  • Click New Add-On -> Create New Add-On
  • Enter the name prometheus-server
  • Select K8s YAML for the type
  • Select Upload files manually
  • Select the rook-ceph namespace
  • Click Create

Create Add-On

  • Click New Version
  • Enter v1 for the version name
  • Save the below YAML to a file
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: rook-ceph # namespace:cluster
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
aggregationRule:
  clusterRoleSelectors:
  - matchLabels:
      rbac.ceph.rook.io/aggregate-to-prometheus: "true"
rules: []
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus-rules
  labels:
    rbac.ceph.rook.io/aggregate-to-prometheus: "true"
rules:
- apiGroups: [""]
  resources:
  - nodes
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources:
  - configmaps
  verbs: ["get"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: rook-ceph # namespace:cluster
---
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: rook-prometheus
  namespace: rook-ceph # namespace:cluster
  labels:
    prometheus: rook-prometheus
spec:
  serviceAccountName: prometheus
  serviceMonitorSelector:
    matchLabels:
      team: rook
  ruleSelector:
    matchLabels:
      role: alert-rules
      prometheus: rook-prometheus
  resources:
    requests:
      memory: 400Mi
---
apiVersion: v1
kind: Service
metadata:
  name: rook-prometheus
  namespace: rook-ceph # namespace:cluster
spec:
  type: NodePort
  ports:
  - name: web
    nodePort: 30900
    port: 9090
    protocol: TCP
    targetPort: web
  selector:
    prometheus: rook-prometheus
  • Click Upload and select the previously saved YAML file
  • Click Save Changes
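
Like the operator, the Prometheus server is only created once the blueprint is applied. At that point, a quick sanity check such as the following (using the resource names from the YAML above) should show the Prometheus resource and its pod:

# Run after the blueprint is applied
kubectl -n rook-ceph get prometheus rook-prometheus
kubectl -n rook-ceph get pod prometheus-rook-prometheus-0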

Step 3: Update Rook Ceph Operator add-on

In this step, you will update the Rook Ceph Operator add-on to enable the required monitoring components of the Rook Ceph Helm chart.

  • Navigate to Infrastructure -> Add-Ons
  • Click rook-operator
  • Click New Version
  • Enter v2 for the version name
  • Save the below YAML to a file
monitoring:
  # -- Enable monitoring. Requires Prometheus to be pre-installed.
  # Enabling will also create RBAC rules to allow Operator to create ServiceMonitors
  enabled: true
  • Click Upload Files and select the previously saved YAML file
  • Click Save Changes

Create Add-On
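
With monitoring enabled, the Rook operator creates ServiceMonitor resources that tell Prometheus how to scrape the Ceph manager metrics. After the updated blueprint is applied, you can list them as shown below; the exact ServiceMonitor names vary by Rook version:

# Run after the blueprint is applied
kubectl -n rook-ceph get servicemonitors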


Step 4: Update Rook Ceph Cluster add-on

In this step, you will update the Rook Ceph Cluster add-on to enable the required monitoring components and the dashboard of the Rook Ceph Cluster Helm chart.

  • Navigate to Infrastructure -> Add-Ons
  • Click rook-cluster
  • Click New Version
  • Enter v2 for the version name
  • Click the Edit icon on the side of the existing uploaded values file
  • Replace the contents of the file with the below YAML
toolbox:
  enabled: true  
monitoring:
  enabled: true
cephClusterSpec:
  dashboard:
    enabled: true 
  • Click Update
  • Click Save Changes

Create Add-On
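
Once the new add-on version is rolled out through the blueprint, the toolbox deployment and the dashboard service should appear in the rook-ceph namespace. The names below (rook-ceph-tools and rook-ceph-mgr-dashboard) are the chart defaults used later in this guide:

# Run after the blueprint is applied
kubectl -n rook-ceph get deployment rook-ceph-tools
kubectl -n rook-ceph get service rook-ceph-mgr-dashboard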


Step 5: Update Blueprint

In this step, you will update the previously created blueprint by adding the newly created and updated add-ons.

  • Navigate to Infrastructure -> Blueprints
  • Click the name of the previously created blueprint
  • Click New Version
  • Enter v2 for the version name
  • Click Configure Add-Ons
  • Click the + symbol to add the prometheus-operator and the prometheus-server add-ons to the blueprint
  • Add the prometheus-operator as a dependency to the prometheus-server add-on
  • Select the v2 add-on versions for both the rook-operator and the rook-cluster add-ons
  • Add the prometheus-server as a dependency to the rook-operator add-on
  • Click Save Changes

Update Blueprint

  • Click Save Changes

Step 6: Apply Blueprint

In this step, you will apply the previously updated blueprint to the cluster. Applying the blueprint will install Prometheus and the Ceph monitoring components.

  • Navigate to Infrastructure -> Clusters
  • Click the gear icon on your cluster
  • Select Update Blueprint
  • Select v2 for the version
  • Click Save and Publish

Apply Blueprint

  • Click Exit
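
After the blueprint finishes publishing, you can verify from the command line that the monitoring components described in the previous steps were deployed:

# Confirm the Prometheus Operator, Prometheus server, and Rook Ceph pods are running
kubectl -n rook-ceph get pods
# Confirm the Prometheus NodePort service is present
kubectl -n rook-ceph get service rook-prometheus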

Step 7: Retrieve Prometheus Server Endpoint

In this step, you will retrieve the Prometheus server endpoint. This endpoint will be used in a cluster override to update the Helm values of the Ceph Cluster add-on. The Ceph dashboard will use this endpoint to retrieve metrics from the Prometheus server.

  • Execute the following command
echo "http://$(kubectl -n rook-ceph -o jsonpath={.status.hostIP} get pod prometheus-rook-prometheus-0):30900"

Retrieve Endpoint
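
The command combines the host IP of the node running the Prometheus pod with the NodePort (30900) defined in the prometheus-server add-on, producing an endpoint of the form shown below (the address is only an example):

http://10.0.0.4:30900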


Step 8: Create Cluster Override

In this step, you will create a cluster override, which assigns a cluster-specific value to the Ceph Cluster Helm chart. In this case, you will assign the in-cluster Prometheus server endpoint to the Ceph Dashboard.

  • Navigate to Infrastructure -> Cluster Overrides
  • Select New Override
  • Enter a name for the override
  • Select Helm for the type
  • Click Create

Create Override

  • Select rook-cluster as the add-on for the resource Selector
  • Select Specific Clusters for the placement type
  • Select the cluster where Rook Ceph is installed
  • Enter the following YAML into the override configuration, being sure to replace the Prometheus server endpoint placeholder with the previously obtained endpoint

cephClusterSpec:
  dashboard:
    enabled: true  
    prometheusEndpoint: <UPDATE ENDPOINT>
    prometheusEndpointSSLVerify: false
  • Click Save Changes

Create Override
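
The override does not take effect until the blueprint is re-applied in the next step. Once it has been applied, you can confirm the dashboard settings were merged into the CephCluster resource; the resource name rook-ceph below is the Helm chart default and may differ in your environment:

# Run after re-applying the blueprint
kubectl -n rook-ceph get cephcluster rook-ceph -o jsonpath='{.spec.dashboard}' && echo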


Step 9: Apply Blueprint

In this step, you will apply the blueprint to the cluster again to allow the cluster override to take effect.

  • Navigate to Infrastructure -> Clusters
  • Click the gear icon on your cluster
  • Select Update Blueprint
  • Select v2 for the version
  • Click Save and Publish

Apply Blueprint

  • Click Exit

Step 10: Enable Ceph Orchestrator

In this step, you will enable the Ceph Orchestrator module to provide additional functionality within the dashboard.

  • Navigate to Infrastructure -> Clusters
  • Click Resources on your cluster card
  • Select Pods in the left-hand pane
  • Select rook-ceph from the namespace dropdown
  • Enter rook-ceph-tools into the search box
  • Click the Actions button
  • Select Shell and Logs
  • Click the Exec icon to open a shell in the container
  • Enter the following commands in the shell to enable the orchestrator
ceph mgr module enable rook
ceph orch set backend rook

You can verify the orchestrator is enabled by running the following command:

ceph orch status
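
If you prefer the command line to the web console, the same commands can be run directly against the toolbox deployment with kubectl exec, for example:

# Enable the orchestrator and check its status via the toolbox deployment
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph mgr module enable rook
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph orch set backend rook
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph orch status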

Step 11: Access Ceph Dashboard

In this step, you will access the Ceph dashboard using port forwarding for simplicity. For production use, it is advised to expose the dashboard via an ingress or a load balancer.

  • Execute the following command to retrieve the decoded password for the dashboard admin account
kubectl -n rook-ceph get secret rook-ceph-dashboard-password -o jsonpath="{['data']['password']}" | base64 --decode && echo
  • Execute the following command to initiate a port-forwarding session to the dashboard
kubectl port-forward service/rook-ceph-mgr-dashboard 28016:8443 -n rook-ceph
  • In a web browser, navigate to https://localhost:28016/
  • Enter the username admin
  • Enter the previously retrieved password

Once you are logged in, you will see the dashboard.

Dashboard
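
If the dashboard does not load, you can confirm the URL and port the Ceph manager is serving it on (along with the Prometheus module endpoint, if enabled) from the toolbox:

# List the endpoints exposed by the Ceph manager modules
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph mgr services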


Recap

Congratulations! You have successfully enabled monitoring of the Ceph cluster using the Ceph dashboard and Prometheus.