Running Wild Container Image on ICP 3.1.1 — Security and Enforcement


With the release of IBM Cloud Private 3.1.1, an enterprise-grade Kubernetes platform, running a Docker image from an external source is tightly controlled.

By default, when a new Kubernetes object is deployed, ICP validates the container images and allows only those in a predefined whitelist to run. If an image is not whitelisted, the container is rejected and never scheduled. The error message shows up in the output of the kubectl/helm command-line tool. For example,

$ kubectl run -it --rm debug --image=busybox --restart=Never -- sh
Error from server (InternalError): Internal error occurred: admission webhook "trust.hooks.securityenforcement.admission.cloud.ibm.com" denied the request:
Deny "docker.io/busybox", no matching repositories in the ImagePolicies

You can define the whitelist in the ICP console, or define it with the kubectl command line using a YAML file such as the one below,

apiVersion: securityenforcement.admission.cloud.ibm.com/v1beta1
kind: ClusterImagePolicy
metadata:
  name: my-cluster-images-whitelist
spec:
  repositories:
  - name: docker.io/zhiminwen/*
  - name: quay.io/kubernetes-ingress-controller/*

Use the wildcard * to include a whole image repository in the whitelist, as shown above. By default, after installation, only the ICP private registry images, IBM images from the Bluemix registry, and the Docker Hub images used by ICP itself are predefined in the whitelist.

Other than ClusterImagePolicy, you can also define an ImagePolicy, which applies at the namespace scope. Note that once you define an ImagePolicy at the namespace level, all the whitelists from the ClusterImagePolicy are ignored; you have to define your entire image whitelist in the namespace-level ImagePolicy. (The error message shown above was in fact produced after I defined an empty ImagePolicy in the default namespace.)
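For illustration, a namespace-scoped ImagePolicy is assumed to look just like the ClusterImagePolicy above, with kind: ImagePolicy and a namespace in the metadata; the name and repository entry here are placeholders:

apiVersion: securityenforcement.admission.cloud.ibm.com/v1beta1
kind: ImagePolicy
metadata:
  name: default-images-whitelist
  namespace: default
spec:
  repositories:
  - name: docker.io/library/*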

There are more parameters for ClusterImagePolicy/ImagePolicy, such as the policy and Vulnerability Advisor settings. Refer to the IBM Cloud Private documentation for more details.
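As a rough sketch of what such a repository entry might look like, with the Vulnerability Advisor check disabled for one repo; the policy/va/enabled field names are assumptions here, so check the documentation for the authoritative schema:

spec:
  repositories:
  - name: docker.io/zhiminwen/*
    policy:
      va:
        enabled: false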

Before moving on, let's do a simple test to examine a container image failing to run.

Create a new namespace exp with kubectl create ns exp, then create a simple deployment as below,

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: myapp
        image: busybox
        command: ["sh", "-c", "sleep 3600"]

Apply it. Success. Everything seems fine. However,

$ kubectl -n exp get pods
NAME                    READY   STATUS                       RESTARTS   AGE
myapp-b65ff6d5d-7pq47   0/1     CreateContainerConfigError   0          6s

CreateContainerConfigError?! Let's find out more by running a describe,

$ kubectl -n exp describe pods myapp-b65ff6d5d-7pq47
Name:               myapp-b65ff6d5d-7pq47
Namespace:          exp
Priority:           0
PriorityClassName:  <none>
Node:               dev-worker2/192.168.64.248
Start Time:         Sun, 09 Dec 2018 14:14:57 +0800
Labels:             app=myapp
                    pod-template-hash=621992818
Annotations:        container.apparmor.security.beta.kubernetes.io/myapp=runtime/default
                    kubernetes.io/psp=ibm-restricted-psp
                    seccomp.security.alpha.kubernetes.io/pod=docker/default
Status:             Pending
IP:                 10.1.184.101
Controlled By:      ReplicaSet/myapp-b65ff6d5d
Containers:
  myapp:
    Container ID:
    Image:          busybox
    Image ID:
    Port:           <none>
    Host Port:      <none>
    Command:
      sh
      -c
      sleep 3600
    State:          Waiting
      Reason:       CreateContainerConfigError
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-h558l (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  default-token-h558l:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-h558l
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason          Age                From                  Message
  ----     ------          ----               ----                  -------
  Normal   Scheduled       1m                 default-scheduler     Successfully assigned exp/myapp-b65ff6d5d-7pq47 to dev-worker2
  Normal   SandboxChanged  56s                kubelet, dev-worker2  Pod sandbox changed, it will be killed and re-created.
  Normal   Pulling         15s (x5 over 1m)   kubelet, dev-worker2  pulling image "busybox"
  Normal   Pulled          11s (x5 over 57s)  kubelet, dev-worker2  Successfully pulled image "busybox"
  Warning  Failed          11s (x5 over 57s)  kubelet, dev-worker2  Error: container has runAsNonRoot and image will run as root

So the pod is not allowed to run as root. This is controlled by the Pod Security Policy (PSP).

In the latest ICP 3.1.1, the default Pod Security Policy is set to "restricted".

$ cloudctl cm psp-default-get
Default PSP: restricted

If we list the existing PSPs, we have

$ kubectl get psp -o name
podsecuritypolicy.extensions/ibm-anyuid-hostaccess-psp
podsecuritypolicy.extensions/ibm-anyuid-hostpath-psp
podsecuritypolicy.extensions/ibm-anyuid-psp
podsecuritypolicy.extensions/ibm-privileged-psp
podsecuritypolicy.extensions/ibm-restricted-psp

When ICP's default PSP is set to restricted, the PSP ibm-restricted-psp is applied. Listing this specific PSP, we get

$ kubectl get psp ibm-restricted-psp
NAME                 PRIV    CAPS   SELINUX    RUNASUSER          FSGROUP     SUPGROUP    READONLYROOTFS   VOLUMES
ibm-restricted-psp   false          RunAsAny   MustRunAsNonRoot   MustRunAs   MustRunAs   false            configMap,emptyDir,projected,secret,downwardAPI,persistentVolumeClaim

Notice that RUNASUSER is set to MustRunAsNonRoot. That is the reason the busybox pod failed to run.
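If you only want that one rule instead of the whole table, you can pull it straight out of the PSP spec, for example:

$ kubectl get psp ibm-restricted-psp -o jsonpath='{.spec.runAsUser.rule}'
MustRunAsNonRoot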

But how exactly is the PSP applied to the pod?


In the latest ICP, a ClusterRole named ibm-restricted-clusterrole is defined. It grants the use verb on the PSP ibm-restricted-psp. Its YAML is listed below (the annotations field is removed to save space):

$ kubectl get clusterrole ibm-restricted-clusterrole -o yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  annotations: <removed to save space>
  name: ibm-restricted-clusterrole
  resourceVersion: "272"
  selfLink: /apis/rbac.authorization.k8s.io/v1/clusterroles/ibm-restricted-clusterrole
  uid: 9c8a8337-f2d9-11e8-959b-005056b59a82
rules:
- apiGroups:
  - extensions
  resourceNames:
  - ibm-restricted-psp
  resources:
  - podsecuritypolicies
  verbs:
  - use

Next, ICP has a ClusterRoleBinding named ibm-restricted-psp-users. It binds all authenticated users, unauthenticated users, and service accounts to the ClusterRole ibm-restricted-clusterrole:

$ kubectl get clusterrolebindings ibm-restricted-psp-users -o yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  annotations: <removed to save space>
  name: ibm-restricted-psp-users
  resourceVersion: "426"
  selfLink: /apis/rbac.authorization.k8s.io/v1/clusterrolebindings/ibm-restricted-psp-users
  uid: c3a0e45f-f2d9-11e8-959b-005056b59a82
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: ibm-restricted-clusterrole
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: system:authenticated
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: system:unauthenticated
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: system:serviceaccounts

Now it's clear why the busybox container cannot run. When the deployment is created in the namespace exp, the default service account is used; that service account is bound to the cluster role ibm-restricted-clusterrole, and the associated PSP ibm-restricted-psp blocks the container from running as root.
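You can also confirm which PSP was applied to a pod from the kubernetes.io/psp annotation seen in the describe output above, for example:

$ kubectl -n exp get pod myapp-b65ff6d5d-7pq47 -o jsonpath='{.metadata.annotations.kubernetes\.io/psp}'
ibm-restricted-psp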

Once we understand the reason, the fix is easy.

1. Bind the default service account in that namespace to a ClusterRole that allows root containers, say ibm-anyuid-clusterrole.

Apply the following YAML file for the ClusterRoleBinding.

kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: exp-sa-anyuid-binding
roleRef:
  kind: ClusterRole
  name: ibm-anyuid-clusterrole
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
  name: default
  namespace: exp

Delete the old pods so the ReplicaSet recreates them under the new binding, and you will see the pod running.
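One way to do that, using the app=myapp label from the deployment above:

$ kubectl -n exp delete pods -l app=myapp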

$ kubectl -n exp get pods
NAME                    READY   STATUS    RESTARTS   AGE
myapp-b65ff6d5d-8qzhd   1/1     Running   0          7s

2. Instead of patching the default service account, create a new service account and bind the ClusterRole ibm-anyuid-clusterrole to it. In the deployment, use the new service account to run the container.

This approach is more specific to the deployment; it does not weaken the security control on the other pods in the same namespace. A minimal sketch follows.
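In this sketch, the service account name myapp-anyuid-sa and the binding name are just examples:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: myapp-anyuid-sa
  namespace: exp
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: exp-myapp-anyuid-binding
roleRef:
  kind: ClusterRole
  name: ibm-anyuid-clusterrole
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
  name: myapp-anyuid-sa
  namespace: exp

Then reference the service account in the deployment's pod template spec:

    spec:
      serviceAccountName: myapp-anyuid-sa
      containers:
      - name: myapp
        image: busybox
        command: ["sh", "-c", "sleep 3600"]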

3. Fundamentally, fix the root user usage in the Dockerfile.

This is the most secure approach, but sometimes you may not have control over the container images. A sketch of the Dockerfile change is shown below.
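For example, a Dockerfile can switch to a non-root user so the image satisfies MustRunAsNonRoot; the UID 1001 and the user name here are arbitrary:

FROM busybox
# create a non-root user; use a numeric UID in USER so runAsNonRoot can be verified
RUN adduser -D -u 1001 appuser
USER 1001
CMD ["sh", "-c", "sleep 3600"]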

As ICP has turned on tighter control over container images, careful planning is required if there is a business need to run external containers. You will need to:

  • Add the image repo to the ImagePolicy whitelist.
  • Examine which PSP is required; create a role/clusterrole if needed.
  • Create a rolebinding/clusterrolebinding to bind the proper role/clusterrole to the service account so the container is allowed to run.
