Monitor Ceph Cluster
In this part, you will set up and access the Ceph Dashboard to monitor and manage the Ceph cluster.
Step 1: Install Prometheus Operator
For the Ceph Dashboard to be used to its full potential and display cluster metrics, Prometheus is needed to collect those metrics and pass them to the dashboard.
In this step, you will install the Prometheus Operator into the Kubernetes cluster as a custom cluster add-on. The add-on will then be added to the existing custom cluster blueprint.
- Navigate to Infrastructure -> Add-Ons
- Click New Add-On -> Create New Add-On
- Enter the name prometheus-operator
- Select K8s YAML for the type
- Select Upload files manually
- Select the rook-ceph namespace
- Click Create
- Click New Version
- Enter v1 for the version name
- Save the YAML file located HERE to your local machine
- Click Upload and select the previously saved YAML file
- Click Save Changes
Step 2: Install Prometheus Server
In this step, you will install a Prometheus Server into the Kubernetes cluster as a custom cluster add-on. The add-on will then be added to the existing custom cluster blueprint.
- Navigate to Infrastructure -> Add-Ons
- Click New Add-On -> Create New Add-On
- Enter the name prometheus-server
- Select K8s YAML for the type
- Select Upload files manually
- Select the rook-ceph namespace
- Click Create
- Click New Version
- Enter v1 for the version name
- Save the below YAML to a file
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: rook-ceph # namespace:cluster
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
aggregationRule:
  clusterRoleSelectors:
    - matchLabels:
        rbac.ceph.rook.io/aggregate-to-prometheus: "true"
rules: []
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus-rules
  labels:
    rbac.ceph.rook.io/aggregate-to-prometheus: "true"
rules:
  - apiGroups: [""]
    resources:
      - nodes
      - services
      - endpoints
      - pods
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources:
      - configmaps
    verbs: ["get"]
  - nonResourceURLs: ["/metrics"]
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
  - kind: ServiceAccount
    name: prometheus
    namespace: rook-ceph # namespace:cluster
---
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: rook-prometheus
  namespace: rook-ceph # namespace:cluster
  labels:
    prometheus: rook-prometheus
spec:
  serviceAccountName: prometheus
  serviceMonitorSelector:
    matchLabels:
      team: rook
  ruleSelector:
    matchLabels:
      role: alert-rules
      prometheus: rook-prometheus
  resources:
    requests:
      memory: 400Mi
---
apiVersion: v1
kind: Service
metadata:
  name: rook-prometheus
  namespace: rook-ceph # namespace:cluster
spec:
  type: NodePort
  ports:
    - name: web
      nodePort: 30900
      port: 9090
      protocol: TCP
      targetPort: web
  selector:
    prometheus: rook-prometheus
- Click Upload and select the previously saved YAML file
- Click Save Changes
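The Service in the manifest above publishes Prometheus on a fixed nodePort. As a quick sanity check (a local sketch with the value hard-coded from the manifest), note that Kubernetes only accepts nodePort values inside the default service-node-port-range of 30000-32767:

```shell
# nodePort taken from the rook-prometheus Service manifest above.
NODE_PORT=30900

# Kubernetes rejects Services whose nodePort falls outside the
# default --service-node-port-range of 30000-32767.
if [ "${NODE_PORT}" -ge 30000 ] && [ "${NODE_PORT}" -le 32767 ]; then
  echo "nodePort ${NODE_PORT} is within the default range"
else
  echo "nodePort ${NODE_PORT} is outside the default range" >&2
fi
```

If you change the nodePort here, remember to use the same value when building the Prometheus endpoint in Step 7.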
Step 3: Update Rook Ceph Operator add-on
In this step, you will update the Rook Ceph Operator add-on to enable the required monitoring components of the Rook Ceph Helm chart.
- Navigate to Infrastructure -> Add-Ons
- Click rook-operator
- Click New Version
- Enter v2 for the version name
- Save the below YAML to a file
monitoring:
  # -- Enable monitoring. Requires Prometheus to be pre-installed.
  # Enabling will also create RBAC rules to allow Operator to create ServiceMonitors
  enabled: true
- Click Upload Files and select the previously saved YAML file
- Click Save Changes
Step 4: Update Rook Ceph Cluster add-on
In this step, you will update the Rook Ceph Cluster add-on to enable the required monitoring components and the dashboard of the Rook Ceph Helm chart.
- Navigate to Infrastructure -> Add-Ons
- Click rook-cluster
- Click New Version
- Enter v2 for the version name
- Click the Edit icon on the side of the existing uploaded values file
- Replace the contents of the file with the below YAML
toolbox:
  enabled: true
monitoring:
  enabled: true
cephClusterSpec:
  dashboard:
    enabled: true
- Click Update
- Click Save Changes
Step 5: Update Blueprint
In this step, you will update the previously created blueprint by adding the newly created and updated add-ons.
- Navigate to Infrastructure -> Blueprints
- Click the name of the previously created blueprint
- Click New Version
- Enter v2 for the version name
- Click Configure Add-Ons
- Click the + symbol to add the prometheus-operator and the prometheus-server add-ons to the blueprint
- Add the prometheus-operator as a dependency to the prometheus-server add-on
- Select the v2 add-on versions for both the rook-operator and the rook-cluster add-ons
- Add the prometheus-server as a dependency to the rook-operator add-on
- Click Save Changes
- Click Save Changes
Step 6: Apply Blueprint
In this step, you will apply the previously updated blueprint to the cluster. Applying the blueprint will install Prometheus and the Ceph monitoring components.
- Navigate to Infrastructure -> Clusters
- Click the gear icon on your cluster
- Select Update Blueprint
- Select v2 for the version
- Click Save and Publish
- Click Exit
Step 7: Retrieve Prometheus Server Endpoint
In this step, you will retrieve the Prometheus server endpoint. This endpoint will be used in a cluster override to update the Helm values of the Ceph Cluster add-on. The Ceph dashboard will use this endpoint to retrieve metrics from the Prometheus server.
- Execute the following command
echo "http://$(kubectl -n rook-ceph -o jsonpath={.status.hostIP} get pod prometheus-rook-prometheus-0):30900"
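The command above joins two pieces of information: the IP of the node running the Prometheus pod (read from the pod's .status.hostIP field) and the nodePort 30900 defined in the rook-prometheus Service. A minimal sketch of that composition, using a made-up host IP in place of the kubectl lookup:

```shell
# Hypothetical host IP standing in for the value kubectl extracts
# from the prometheus-rook-prometheus-0 pod's .status.hostIP field.
HOST_IP="10.0.0.12"

# nodePort defined in the rook-prometheus Service manifest (Step 2).
NODE_PORT=30900

# Same string the kubectl one-liner prints on a real cluster.
PROMETHEUS_ENDPOINT="http://${HOST_IP}:${NODE_PORT}"
echo "${PROMETHEUS_ENDPOINT}"
```

On a real cluster, the host IP comes from kubectl and the printed URL is the value you will paste into the override in the next step.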
Step 8: Create Cluster Override
In this step, you will create a cluster override, which assigns a cluster-specific value to the Ceph Cluster Helm chart. In this case, you will assign the in-cluster Prometheus server endpoint to the Ceph Dashboard.
- Navigate to Infrastructure -> Cluster Overrides
- Select New Override
- Enter a name for the override
- Select Helm for the type
- Click Create
- Select rook-cluster as the add-on for the resource Selector
- Select Specific Clusters for the placement type
- Select the cluster where Rook Ceph is installed
- Enter the following YAML into the override configuration, replacing <UPDATE ENDPOINT> with the Prometheus server endpoint obtained in the previous step
cephClusterSpec:
  dashboard:
    enabled: true
    prometheusEndpoint: <UPDATE ENDPOINT>
    prometheusEndpointSSLVerify: false
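For illustration only, with a hypothetical endpoint of http://10.0.0.12:30900 (the kind of URL printed in Step 7) substituted for the placeholder, the finished override would look like:

```yaml
cephClusterSpec:
  dashboard:
    enabled: true
    # Hypothetical value; use the endpoint printed by the Step 7 command.
    prometheusEndpoint: http://10.0.0.12:30900
    prometheusEndpointSSLVerify: false
```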
Step 9: Apply Blueprint
In this step, you will apply the blueprint to the cluster again to allow the cluster override to take effect.
- Navigate to Infrastructure -> Clusters
- Click the gear icon on your cluster
- Select Update Blueprint
- Select v2 for the version
- Click Save and Publish
- Click Exit
Step 10: Enable Ceph Orchestrator
In this step, you will enable the Ceph Orchestrator module, which unlocks additional functionality within the dashboard.
- Navigate to Infrastructure -> Clusters
- Click Resources on your cluster card
- Select Pods in the left hand pane
- Select rook-ceph from the namespace dropdown
- Enter rook-ceph-tools into the search box
- Click the Actions icon
- Select Shell and Logs
- Click the Exec icon to open a shell into the container
- Enter the following commands in the shell to enable the orchestrator
ceph mgr module enable rook
ceph orch set backend rook
You can verify the orchestrator is enabled by running the following command:
ceph orch status
Step 11: Access Ceph Dashboard
In this step, you will access the Ceph dashboard using port forwarding for simplicity. In production, it is advised to expose the dashboard via an ingress or a load balancer.
- Execute the following command to decode the password for the admin account of the dashboard
kubectl -n rook-ceph get secret rook-ceph-dashboard-password -o jsonpath="{['data']['password']}" | base64 --decode && echo
- Execute the following command to initiate a port-forwarding session to the dashboard
kubectl port-forward service/rook-ceph-mgr-dashboard 28016:8443 -n rook-ceph
- In a web browser, navigate to https://localhost:28016/
- Enter the username admin
- Enter the previously decoded password
Once you are logged in, you will see the dashboard.
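Kubernetes stores Secret data base64-encoded rather than encrypted, which is why the password command above pipes the value through base64 --decode. A self-contained sketch of that decode step, using a placeholder value in place of the real Secret:

```shell
# Placeholder standing in for the .data.password field of the
# rook-ceph-dashboard-password Secret (base64 for "supersecret").
ENCODED="c3VwZXJzZWNyZXQ="

# Same decode step as the kubectl one-liner; the trailing echo
# just adds a newline after the decoded value.
echo "${ENCODED}" | base64 --decode && echo
```

Anyone with read access to Secrets in the rook-ceph namespace can decode this value, so restrict that access accordingly.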
Recap
Congratulations! You have successfully enabled monitoring of the Ceph cluster using the Ceph dashboard and Prometheus.