Create alerts for OpenShift user workloads

Starting with OpenShift 4.6, user workload monitoring is officially supported: a second Prometheus Operator instance runs in a new namespace called openshift-user-workload-monitoring. This article demonstrates how a user workload can be monitored and how alerts can be created for it.

Turn on user workload monitoring

To enable user workload monitoring, create (or edit) the cluster-monitoring-config ConfigMap in the openshift-monitoring namespace:

apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    enableUserWorkload: true
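
Apply the ConfigMap and the Cluster Monitoring Operator rolls out the new stack automatically (the file name below is just an assumption):

oc apply -f cluster-monitoring-config.yaml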

With the enableUserWorkload key set to true, a second Prometheus Operator is installed, which in turn creates a Prometheus and a Thanos Ruler instance, as shown below:

oc -n openshift-user-workload-monitoring get pods
NAME                                   READY   STATUS    RESTARTS   AGE
prometheus-operator-7bd67b9d5d-znr8r   2/2     Running   0          19h
prometheus-user-workload-0             5/5     Running   1          19h
prometheus-user-workload-1             5/5     Running   1          19h
thanos-ruler-user-workload-0           3/3     Running   0          19h
thanos-ruler-user-workload-1           3/3     Running   0          19h

The monitoring architecture is depicted in the following diagram.

[Diagram: OpenShift Monitoring]

If a user workload exposes Prometheus-compatible metrics, the user workload Prometheus can scrape them. Through the Thanos Querier, user workload metrics can also be queried together with the metrics of the main cluster-monitoring Prometheus.

Note that the user workload Prometheus shares the main Alertmanager for alerting. If an alert's metric expression can be evaluated entirely within the user workload Prometheus, the alert is generated there and forwarded to the main Alertmanager. In addition, a Thanos Ruler is introduced in the same namespace; it can evaluate expressions over both cluster-level and user workload Prometheus metrics. It is multitenancy aware: a rule sees only the metrics of its own project.

Let’s check it out.

A sample app that exposes Prometheus metrics

Compile the code, build the image, and push it to the OpenShift internal registry in the app-mon namespace.
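
One way to get the image there is to build locally and push through the registry's default route. This is a minimal sketch; it assumes the app source and its Dockerfile are in the current directory and that the registry's default route has been exposed:

oc new-project app-mon
# Build the image locally from the app's Dockerfile
podman build -t app:1.0 .
# Log in to the internal registry through its default route, then tag and push
REGISTRY=$(oc get route default-route -n openshift-image-registry -o jsonpath='{.spec.host}')
podman login -u "$(oc whoami)" -p "$(oc whoami -t)" "$REGISTRY"
podman tag app:1.0 "$REGISTRY/app-mon/app:1.0"
podman push "$REGISTRY/app-mon/app:1.0"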

Create a PVC that allocates 1Mi of storage. We will later use the volume usage metrics from the cluster-level Prometheus to construct an alert for our user workload.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc
  namespace: app-mon
spec:
  accessModes:
  - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 1Mi

Deploy the app with the following Deployment resource:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
  namespace: app-mon
  labels:
    app: app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: app
  template:
    metadata:
      labels:
        app: app
    spec:
      containers:
      - name: app
        image: image-registry.openshift-image-registry.svc:5000/app-mon/app:1.0
        volumeMounts:
        - name: data
          mountPath: /data
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: data-pvc

Expose it with a Service:

apiVersion: v1
kind: Service
metadata:
  name: svc-app
  namespace: app-mon
  labels:
    app: app
spec:
  selector:
    app: app
  ports:
  - name: http
    protocol: TCP
    port: 8080
    targetPort: 8080

Notice that we label the Service so that the Prometheus Operator can select it for metrics scraping. Create the following ServiceMonitor resource:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: svcmon-app
  namespace: app-mon
spec:
  selector:
    matchLabels:
      app: app
  endpoints:
  - port: http

Monitor the metrics

Validate that the service is discovered as an active target and that the metrics are being collected.
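
For a quick check from the command line, you can hit the metrics endpoint directly. This is a sketch: the /metrics path and the myapp_client_connected metric name are assumptions based on the alert rule used later in this article.

# Forward the app's port locally and fetch the raw Prometheus endpoint
oc -n app-mon port-forward deployment/app 8080:8080 &
curl -s http://localhost:8080/metrics | grep myapp_client_connected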

In the meantime, the metrics can also be queried through the cluster-level Thanos Querier.
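
For instance, a user with sufficient monitoring privileges can query the Thanos Querier API over its route (a sketch; the authentication and route details are assumptions about your cluster setup):

# Query the cluster-level Thanos Querier for the app metric
HOST=$(oc get route thanos-querier -n openshift-monitoring -o jsonpath='{.spec.host}')
curl -skG -H "Authorization: Bearer $(oc whoami -t)" \
  "https://$HOST/api/v1/query" --data-urlencode 'query=myapp_client_connected'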

Create an alert with user metrics only

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: my-app-alert-rules
  namespace: app-mon
  labels:
    openshift.io/prometheus-rule-evaluation-scope: leaf-prometheus
spec:
  groups:
  - name: app-mon
    rules:
    - alert: too-many-client-connected
      annotations:
        description: 'client connected now: {{ $value }}'
        summary: too many connections
      expr: myapp_client_connected > 1
      for: 1m
      labels:
        severity: warning

As the metrics live entirely within the user workload Prometheus, we can pin the rule to that Prometheus by applying the label openshift.io/prometheus-rule-evaluation-scope: leaf-prometheus.

Check the user workload Prometheus and you can see that the rule has been created.

Pump in some load and verify that the alert fires.

In the meantime, on the OpenShift web console, select the Source = User filter; the alert is shown there as well.

Cluster-level notifications through the main Alertmanager can then be sent as usual.

Create the same alert evaluated by Thanos Ruler
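
To do this, recreate the PrometheusRule without the leaf-prometheus label; the rule body itself is unchanged:

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: my-app-alert-rules
  namespace: app-mon
  # no openshift.io/prometheus-rule-evaluation-scope label, so the rule
  # is handed to Thanos Ruler instead of the user workload Prometheus
spec:
  groups:
  - name: app-mon
    rules:
    - alert: too-many-client-connected
      annotations:
        description: 'client connected now: {{ $value }}'
        summary: too many connections
      expr: myapp_client_connected > 1
      for: 1m
      labels:
        severity: warning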

You will notice that the rule disappears from the user workload Prometheus. Open the Thanos Ruler UI and you will see the rule being evaluated there instead.

Notice that a namespace label matcher is added to the expression automatically, so the rule can only see data in its own namespace.
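
The evaluated expression ends up looking roughly like this (a sketch of the enforced namespace matcher):

expr: myapp_client_connected{namespace="app-mon"} > 1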

Create a user alert with cluster metrics

Create the following rule to monitor the PVC volume usage.

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: my-infra-alert-rules
  namespace: app-mon
spec:
  groups:
  - name: infra-mon
    rules:
    - alert: pvc-volume-full
      annotations:
        description: '{{ printf "%.2f" $value }}% full for PVC {{ $labels.persistentvolumeclaim }}.'
        summary: PVC volume is full
      expr: kubelet_volume_stats_used_bytes / kubelet_volume_stats_capacity_bytes * 100 > 0.01
      for: 1m
      labels:
        severity: warning

As we set the threshold very low, the alert fires almost immediately.

Tip: You can watch the PrometheusRule resource get picked up and verify that the rule lands in either the ConfigMap prometheus-user-workload-rulefiles-0 or thanos-ruler-user-workload-rulefiles-0 in the openshift-user-workload-monitoring project.
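
For example:

oc -n openshift-user-workload-monitoring describe configmap prometheus-user-workload-rulefiles-0
oc -n openshift-user-workload-monitoring describe configmap thanos-ruler-user-workload-rulefiles-0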

Conclusion