

Monitor Ceph Cluster

In this part, you will set up and access the Ceph Dashboard to monitor and manage the Ceph cluster.


Step 1: Install Prometheus Operator

For the Ceph Dashboard to be used to its full potential and display cluster metrics, Prometheus is needed to collect those metrics and make them available to the dashboard.

In this step, you will install the Prometheus Operator into the Kubernetes cluster as a custom cluster add-on. The add-on will then be added to the existing custom cluster blueprint.

  • Navigate to Infrastructure -> Add-Ons
  • Click New Add-On -> Create New Add-On
  • Enter the name prometheus-operator
  • Select K8s YAML for the type
  • Select Upload files manually
  • Select the rook-ceph namespace
  • Click Create

Create Add-On

  • Click New Version
  • Enter v1 for the version name
  • Save the YAML file located HERE to your local machine
  • Click Upload and select the previously saved YAML file
  • Click Save Changes
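
The Prometheus Operator itself is not deployed until the blueprint is applied in a later step. Once that has happened, a quick check similar to the following can confirm the operator and its CRDs are present. The exact deployment name depends on the manifest you uploaded, so treat the grep pattern as an assumption:

# Run after the blueprint is applied
# Confirm the Prometheus Operator deployment exists (name assumed from the add-on/manifest)
kubectl -n rook-ceph get deployments | grep -i prometheus-operator
# Confirm the Prometheus Operator CRDs were registered
kubectl get crd | grep monitoring.coreos.com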

Step 2: Install Prometheus Server

In this step, you will install a Prometheus Server into the Kubernetes cluster as a custom cluster add-on. The add-on will then be added to the existing custom cluster blueprint.

  • Navigate to Infrastructure -> Add-Ons
  • Click New Add-On -> Create New Add-On
  • Enter the name prometheus-server
  • Select K8s YAML for the type
  • Select Upload files manually
  • Select the rook-ceph namespace
  • Click Create

Create Add-On

  • Click New Version
  • Enter v1 for the version name
  • Save the below YAML to a file
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: rook-ceph # namespace:cluster
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
aggregationRule:
  clusterRoleSelectors:
  - matchLabels:
      rbac.ceph.rook.io/aggregate-to-prometheus: "true"
rules: []
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus-rules
  labels:
    rbac.ceph.rook.io/aggregate-to-prometheus: "true"
rules:
- apiGroups: [""]
  resources:
  - nodes
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources:
  - configmaps
  verbs: ["get"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: rook-ceph # namespace:cluster
---
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: rook-prometheus
  namespace: rook-ceph # namespace:cluster
  labels:
    prometheus: rook-prometheus
spec:
  serviceAccountName: prometheus
  serviceMonitorSelector:
    matchLabels:
      team: rook
  ruleSelector:
    matchLabels:
      role: alert-rules
      prometheus: rook-prometheus
  resources:
    requests:
      memory: 400Mi
---
apiVersion: v1
kind: Service
metadata:
  name: rook-prometheus
  namespace: rook-ceph # namespace:cluster
spec:
  type: NodePort
  ports:
  - name: web
    nodePort: 30900
    port: 9090
    protocol: TCP
    targetPort: web
  selector:
    prometheus: rook-prometheus
  • Click Upload and select the previously saved YAML file
  • Click Save Changes
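
Like the operator, the Prometheus server is only created once the blueprint is applied. At that point, a quick sanity check such as the following (using the resource names from the YAML above) should show the Prometheus resource and its pod:

# Run after the blueprint is applied
kubectl -n rook-ceph get prometheus rook-prometheus
kubectl -n rook-ceph get pod prometheus-rook-prometheus-0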

Step 3: Update Rook Ceph Operator add-on

In this step, you will update the Rook Ceph Operator add-on to enable the required monitoring components of the Rook Ceph Helm chart.

  • Navigate to Infrastructure -> Add-Ons
  • Click rook-operator
  • Click New Version
  • Enter v2 for the version name
  • Save the below YAML to a file
monitoring:
  # -- Enable monitoring. Requires Prometheus to be pre-installed.
  # Enabling will also create RBAC rules to allow Operator to create ServiceMonitors
  enabled: true
  • Click Upload Files and select the previously saved YAML file
  • Click Save Changes

Create Add-On
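
With monitoring enabled, the Rook operator creates ServiceMonitor resources that tell Prometheus how to scrape the Ceph manager metrics. After the updated blueprint is applied, you can list them as shown below; the exact ServiceMonitor names vary by Rook version:

# Run after the blueprint is applied
kubectl -n rook-ceph get servicemonitors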


Step 4: Update Rook Ceph Cluster add-on

In this step, you will update the Rook Ceph Cluster add-on to enable the required monitoring components and the dashboard of the Rook Ceph Cluster Helm chart.

  • Navigate to Infrastructure -> Add-Ons
  • Click rook-cluster
  • Click New Version
  • Enter v2 for the version name
  • Click the Edit icon on the side of the existing uploaded values file
  • Replace the contents of the file with the below YAML
toolbox:
  enabled: true  
monitoring:
  enabled: true
cephClusterSpec:
  dashboard:
    enabled: true 
  • Click Update
  • Click Save Changes

Create Add-On
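
Once the new add-on version is rolled out through the blueprint, the toolbox deployment and the dashboard service should appear in the rook-ceph namespace. The names below (rook-ceph-tools and rook-ceph-mgr-dashboard) are the chart defaults used later in this guide:

# Run after the blueprint is applied
kubectl -n rook-ceph get deployment rook-ceph-tools
kubectl -n rook-ceph get service rook-ceph-mgr-dashboard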


Step 5: Update Blueprint

In this step, you will update the previously created blueprint by adding the newly created and updated add-ons.

  • Navigate to Infrastructure -> Blueprints
  • Click the name of the previously created blueprint
  • Click New Version
  • Enter v2 for the version name
  • Click Configure Add-Ons
  • Click the + symbol to add the prometheus-operator and the prometheus-server add-ons to the blueprint
  • Add the prometheus-operator as a dependency to the prometheus-server add-on
  • Select the v2 add-on versions for both the rook-operator and the rook-cluster add-ons
  • Add the prometheus-server as a dependency to the rook-operator add-on
  • Click Save Changes

Update Blueprint

  • Click Save Changes

Step 6: Apply Blueprint

In this step, you will apply the previously updated blueprint to the cluster. Applying the blueprint will install Prometheus and the Ceph monitoring components.

  • Navigate to Infrastructure -> Clusters
  • Click the gear icon on your cluster
  • Select Update Blueprint
  • Select v2 for the version
  • Click Save and Publish

Apply Blueprint

  • Click Exit
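
After the blueprint finishes publishing, you can verify from the command line that the monitoring components described in the previous steps were deployed:

# Confirm the Prometheus Operator, Prometheus server, and Rook Ceph pods are running
kubectl -n rook-ceph get pods
# Confirm the Prometheus NodePort service is present
kubectl -n rook-ceph get service rook-prometheus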

Step 7: Retrieve Prometheus Server Endpoint

In this step, you will retrieve the Prometheus server endpoint. This endpoint will be used in a cluster override to update the Helm values of the Ceph Cluster add-on. The Ceph dashboard will use this endpoint to retrieve metrics from the Prometheus server.

  • Execute the following command
echo "http://$(kubectl -n rook-ceph -o jsonpath={.status.hostIP} get pod prometheus-rook-prometheus-0):30900"

Retrieve Endpoint
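
The command combines the host IP of the node running the Prometheus pod with the NodePort (30900) defined in the prometheus-server add-on, producing an endpoint of the form shown below (the address is only an example):

http://10.0.0.4:30900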


Step 8: Create Cluster Override

In this step, you will create a cluster override, which assigns a cluster-specific value to the Ceph Cluster Helm chart. In this case, you will assign the in-cluster Prometheus server endpoint to the Ceph Dashboard.

  • Navigate to Infrastructure -> Cluster Overrides
  • Select New Override
  • Enter a name for the override
  • Select Helm for the type
  • Click Create

Create Override

  • Select rook-cluster as the add-on for the resource Selector
  • Select Specific Clusters for the placement type
  • Select the cluster where Rook Ceph is installed
  • Enter the following YAML into the override configuration, being sure to replace the Prometheus server endpoint placeholder with the previously obtained endpoint

cephClusterSpec:
  dashboard:
    enabled: true  
    prometheusEndpoint: <UPDATE ENDPOINT>
    prometheusEndpointSSLVerify: false
  • Click Save Changes

Create Override
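
The override does not take effect until the blueprint is re-applied in the next step. Once it has been applied, you can confirm the dashboard settings were merged into the CephCluster resource; the resource name rook-ceph below is the Helm chart default and may differ in your environment:

# Run after re-applying the blueprint
kubectl -n rook-ceph get cephcluster rook-ceph -o jsonpath='{.spec.dashboard}' && echo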


Step 9: Apply Blueprint

In this step, you will apply the blueprint to the cluster again to allow the cluster override to take effect.

  • Navigate to Infrastructure -> Clusters
  • Click the gear icon on your cluster
  • Select Update Blueprint
  • Select v2 for the version
  • Click Save and Publish

Apply Blueprint

  • Click Exit

Step 10: Enable Ceph Orchestrator

In this step, you will enable the Ceph Orchestrator module to provide additional functionality within the dashboard.

  • Navigate to Infrastructure -> Clusters
  • Click Resources on your cluster card
  • Select Pods in the left-hand pane
  • Select rook-ceph from the namespace dropdown
  • Enter rook-ceph-tools into the search box
  • Click the Actions button
  • Select Shell and Logs
  • Click the Exec icon to open a shell in the container
  • Enter the following commands in the shell to enable the orchestrator
ceph mgr module enable rook
ceph orch set backend rook

You can verify the orchestrator is enabled by running the following command:

ceph orch status
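
If you prefer the command line to the web console, the same commands can be run directly against the toolbox deployment with kubectl exec, for example:

# Enable the orchestrator and check its status via the toolbox deployment
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph mgr module enable rook
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph orch set backend rook
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph orch status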

Step 11: Access Ceph Dashboard

In this step, you will access the Ceph dashboard using port forwarding for simplicity. For production use, it is advised to expose the dashboard via an ingress or a load balancer.

  • Execute the following command to retrieve the decoded password for the dashboard admin account
kubectl -n rook-ceph get secret rook-ceph-dashboard-password -o jsonpath="{['data']['password']}" | base64 --decode && echo
  • Execute the following command to initiate a port-forwarding session to the dashboard
kubectl port-forward service/rook-ceph-mgr-dashboard 28016:8443 -n rook-ceph
  • In a web browser, navigate to https://localhost:28016/
  • Enter the username admin
  • Enter the previously retrieved password

Once you are logged in, you will see the dashboard.

Dashboard
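
If the dashboard does not load, you can confirm the URL and port the Ceph manager is serving it on (along with the Prometheus module endpoint, if enabled) from the toolbox:

# List the endpoints exposed by the Ceph manager modules
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph mgr services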


Recap

Congratulations! You have successfully enabled monitoring of the Ceph cluster using the Ceph dashboard and Prometheus.