OKD OpenShift 4 Monitoring


In this post I will show you how to use the embedded Prometheus monitoring stack in OpenShift 4 to monitor your own workload applications.


Create Demo app to monitor

First I deploy WordPress with helmfile to use as the application to monitor.

oc new-project monitoring-test

oc project monitoring-test

Deploy test app

nano wordpress-helmfile.yaml
---
helmDefaults:
  createNamespace: false

repositories:
- name: bitnami
  url: https://charts.bitnami.com/bitnami

releases:
- name: wordpress-test
  namespace: monitoring-test
  chart: bitnami/wordpress
  set:
  - name: mariadb.primary.containerSecurityContext.enabled
    value: false
  - name: mariadb.primary.podSecurityContext.enabled
    value: false
  - name: mariadb.auth.rootPassword
    value: wordpress-test
  - name: mariadb.primary.persistence.enabled
    value: true
  - name: mariadb.primary.persistence.size
    value: 20Gi
  - name: wordpressPassword
    value: wordpress-test
  - name: persistence.enabled
    value: false
  - name: containerSecurityContext.enabled
    value: false
  - name: podSecurityContext.enabled
    value: false

helmfile -f wordpress-helmfile.yaml apply
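
Once the release is applied, verify that the WordPress and MariaDB pods came up before moving on:

oc get pods -n monitoring-test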

Enable user namespace monitoring

By default the embedded Prometheus only monitors the cluster components. To monitor applications in user namespaces, you need to enable user workload monitoring.

oc apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    enableUserWorkload: true
    alertmanagerMain:
      enableUserAlertmanagerConfig: true
EOF

This creates a separate Prometheus stack for user workload monitoring. Check its status:

oc get pods -n openshift-user-workload-monitoring
NAME                                  READY   STATUS    RESTARTS   AGE
prometheus-operator-7c55995fb-zsjxk   2/2     Running   0          4m
prometheus-user-workload-0            6/6     Running   0          4m
prometheus-user-workload-1            6/6     Running   0          4m
thanos-ruler-user-workload-0          3/3     Running   0          4m
thanos-ruler-user-workload-1          3/3     Running   0          4m
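
If you want to check that the user workload Prometheus already sees metrics from the new namespace, one way (assuming the default thanos-querier route in the openshift-monitoring namespace) is to query it with your own token:

TOKEN=$(oc whoami -t)
HOST=$(oc get route thanos-querier -n openshift-monitoring -o jsonpath='{.spec.host}')
curl -sk -G -H "Authorization: Bearer $TOKEN" \
  "https://$HOST/api/v1/query" \
  --data-urlencode 'query=kube_pod_status_ready{namespace="monitoring-test",condition="true"}'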

Now we can create an alert routing configuration to send email notifications for firing alerts.

oc apply -f - <<EOF
apiVersion: monitoring.coreos.com/v1beta1
kind: AlertmanagerConfig
metadata:
  name: pod-num-alert
  namespace: monitoring-test
spec:
  receivers:
    - name: default
    - name: email_alert
      emailConfigs:
      - to: devopstales@
        from: prometheus@mydomain.intra
        smarthost: mail.mydomain.intra:25
        requireTLS: false
        sendResolved: true
  route:
    receiver: email_alert
    groupInterval: 5m
    groupWait: 30s
    repeatInterval: 10m
    groupBy:
      - namespace
    routes:
      - matchers:
          - name: severity
            value: wordpress
            matchType: "="
        receiver: email_alert
EOF
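
You can check that the AlertmanagerConfig object was accepted; because enableUserAlertmanagerConfig is set to true, it should be picked up by the main Alertmanager in openshift-monitoring:

oc get alertmanagerconfig -n monitoring-test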

Create test alert rules:

oc apply -f - <<EOF
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: pod-num-alert
  namespace: monitoring-test
spec:
  groups:
  - name: monitoring-test
    rules:
    - alert: RunningPodNumAlert
      expr: sum(kube_pod_status_ready{namespace="monitoring-test",condition="true"}) != 3
      for: 5m
      labels:
        namespace: monitoring-test
        severity: wordpress
---
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: pod-memory-alert
  namespace: monitoring-test
spec:
  groups:
  - name: monitoring-test
    rules:
    - alert: RunningPodMemoryAlert
      expr: sum(container_memory_working_set_bytes{image!="",container!="POD",namespace="monitoring-test"}) by (namespace,container,pod) / 1000000 > 90
      for: 5m
      labels:
        namespace: monitoring-test
        severity: wordpress
EOF
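
To test the alerting path end to end, you can, for example, scale the WordPress deployment down (the name wordpress-test is simply the Helm release name, so adjust it if yours differs). After the 5 minute for interval, RunningPodNumAlert should show up as firing in the web console (Observe > Alerting) and an email should arrive through the route defined above:

oc scale deployment wordpress-test -n monitoring-test --replicas=0

When you are done testing, scale it back up:

oc scale deployment wordpress-test -n monitoring-test --replicas=1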