Default Toleration at Namespace Level

I have an OpenShift cluster where I dedicate some of the nodes to run my workload by tainting these nodes. To run my normal pods on these nodes, I just need to define the tolerations based on the taint keys.
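A toleration on a pod has to mirror the taint it is meant to tolerate. As a minimal sketch, the snippet below writes a pod manifest that tolerates a taint with key reservedFor, value myApp and effect NoSchedule (the taint used later in this post). The pod name and the file name are made up for illustration.

```shell
# Write a hypothetical pod manifest whose toleration matches the taint
# reservedFor=myApp:NoSchedule. File and pod names are illustrative only.
cat > tolerant-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: tolerant-pod
spec:
  tolerations:
  - key: reservedFor
    operator: Equal
    value: myApp
    effect: NoSchedule
  containers:
  - name: app
    image: alpine
    command: ["sh", "-c", "while true; do sleep 10; done"]
EOF

# Apply it against a cluster with:
#   kubectl apply -f tolerant-pod.yaml
```

Using operator Exists and dropping the value would tolerate every taint with that key, whatever its value.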

However, the workload is operator-based, and unfortunately not every CRD exposes tolerations. A “brute-force change” on the Deployment or the StatefulSet will not stick, as “the big brother” (the operator) will reconcile it back to the definition it has in its original mind ;)

I need the missing toleration to be inserted after the operator creates the K8s resource and before the resource is persisted and submitted for scheduling. This is where admission controllers, particularly PodTolerationRestriction, can help.

The PodTolerationRestriction admission controller checks whether the Pod's tolerations conflict with a predefined whitelist, and it can also apply default tolerations defined at the namespace level through an annotation. If the pod doesn’t have the toleration, the default toleration will be applied.

Let’s check it out.

Taint all the worker nodes so that pods without a matching toleration cannot be scheduled on them.

kubectl taint nodes {{ .node }} reservedFor=myApp:NoSchedule
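On a cluster with several workers, the taint can be applied to all of them at once through a label selector instead of node by node. The sketch below assumes the workers carry the node-role.kubernetes.io/worker label, which is worth confirming first with kubectl get nodes --show-labels. The function name is hypothetical.

```shell
# Taint every node matching a label selector, then list the taint keys per node.
# Assumption: worker nodes are labelled node-role.kubernetes.io/worker.
taint_workers() {
  local selector="${1:-node-role.kubernetes.io/worker}"
  local taint="${2:-reservedFor=myApp:NoSchedule}"

  # kubectl taint accepts -l to select nodes by label;
  # --overwrite makes the call safe to repeat.
  kubectl taint nodes -l "$selector" "$taint" --overwrite

  # Print each node name with its taint keys to confirm the result.
  kubectl get nodes -l "$selector" \
    -o custom-columns='NAME:.metadata.name,TAINTS:.spec.taints[*].key'
}

# Usage: taint_workers
```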

Create a namespace “tolerant” and deploy a test app in it, as shown below.

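apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
  namespace: tolerant
  labels:
    app: app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: app
  template:
    metadata:
      labels:
        app: app
    spec:
      containers:
      - name: app
        image: alpine
        command:
        - sh
        - -c
        - while true; do sleep 10; done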

After applying it, the pod stays in the Pending state. Describing it shows the following:

Warning  FailedScheduling  10s (x3 over 27s)  default-scheduler  0/4 nodes are available: 1 node(s) had taint { }, that the pod didn't tolerate, 3 node(s) had taint {reservedFor: myApp}, that the pod didn't tolerate.

In my four-node cluster, the master node is not schedulable, and the 3 workers are also not schedulable because the deployment lacks a toleration for the reservedFor taint key.

Now, let’s annotate the namespace with the default tolerations.

oc annotate namespace tolerant 'scheduler.alpha.kubernetes.io/defaultTolerations'='[{"operator": "Exists", "effect": "NoSchedule", "key": "reservedFor"}]'
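If the namespace is kept in version control, the same default can live in the Namespace manifest instead of being set imperatively. This is a minimal sketch, assuming the annotation key scheduler.alpha.kubernetes.io/defaultTolerations, which is the key the upstream Kubernetes documentation lists for PodTolerationRestriction defaults. The file name is hypothetical.

```shell
# Write a Namespace manifest carrying the default tolerations as an annotation.
# Assumption: the annotation key is the one listed in the upstream docs for
# the PodTolerationRestriction admission controller.
cat > tolerant-namespace.yaml <<'EOF'
apiVersion: v1
kind: Namespace
metadata:
  name: tolerant
  annotations:
    scheduler.alpha.kubernetes.io/defaultTolerations: '[{"operator": "Exists", "effect": "NoSchedule", "key": "reservedFor"}]'
EOF

# Apply it and read the annotations back with:
#   kubectl apply -f tolerant-namespace.yaml
#   kubectl get namespace tolerant -o jsonpath='{.metadata.annotations}'
```

The annotation value is a JSON list, so more than one default toleration can be declared in it.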

Delete the pending pod and watch the replacement pod get scheduled and run. Describe it again and check that the default toleration was added automatically by the admission controller:

Tolerations: for 300s for 300s
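The delete-and-inspect step can also be scripted. The sketch below assumes the namespace tolerant, the Deployment named app and the pod label app=app from the test app. The function name is made up, and the output format depends on the kubectl version.

```shell
# Recreate the pod and print the tolerations the API server stored for it.
# Assumptions: namespace "tolerant", Deployment "app", pod label app=app.
show_tolerations() {
  local ns="${1:-tolerant}"
  local deploy="${2:-app}"
  local selector="${3:-app=app}"

  # Deleting the pod makes the Deployment create a new one, which goes
  # through admission again and picks up the namespace defaults.
  kubectl -n "$ns" delete pod -l "$selector"

  # Wait until the Deployment reports its replicas as ready again.
  kubectl -n "$ns" rollout status "deployment/$deploy" --timeout=120s

  # Dump only the tolerations of the matching pods.
  kubectl -n "$ns" get pod -l "$selector" \
    -o jsonpath='{.items[*].spec.tolerations}'
}

# Usage: show_tolerations
```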

The problem of the missing toleration in the operator-based CRD is resolved.

Note that not all admission controllers are turned on by default in standard Kubernetes. In my 4.x OpenShift cluster, the enabled admission controllers are listed in the default YAML file.
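On standard Kubernetes, the plugin has to appear in the API server's --enable-admission-plugins flag. As a rough sketch, the helper below checks a kube-apiserver static pod manifest for a given plugin. It assumes a kubeadm-style control plane where the manifest sits at /etc/kubernetes/manifests/kube-apiserver.yaml, and the function name is hypothetical.

```shell
# Report whether an admission plugin appears in the kube-apiserver
# --enable-admission-plugins flag of a static pod manifest.
# Assumption: kubeadm-style layout for the default manifest path.
plugin_enabled() {
  local plugin="$1"
  local manifest="${2:-/etc/kubernetes/manifests/kube-apiserver.yaml}"

  # Extract the comma-separated plugin list, split it into one plugin
  # per line, and look for an exact match.
  grep -- '--enable-admission-plugins=' "$manifest" \
    | sed 's/.*--enable-admission-plugins=//' \
    | tr ',' '\n' \
    | grep -qx -- "$plugin"
}

# Usage:
#   plugin_enabled PodTolerationRestriction && echo enabled || echo "not enabled"
```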
