
04 Aug 2020

Pod Security Policy Best Practice for multi-tenant cluster

Update: Apr 7, 2021

Pod Security Policy is deprecated in Kubernetes v1.21 and will be removed in v1.25.

A successor to PSP is now under discussion. Going forward, external open source tools such as Open Policy Agent will be recommended for complex controls, and a "PSP Replacement Policy" (temporary name) will be implemented in Kubernetes as a simpler and clearer security feature.

See the details in the official blog post:

https://kubernetes.io/blog/2021/04/06/podsecuritypolicy-deprecation-past-present-and-future/

Pod Security Policy

Pod Security Policy (PSP) is a Kubernetes feature that enforces cluster-wide security policies for Pods, allowing you to control whether a Pod may run based on its security settings.

While Pod Security Policy enables effective and powerful policy control in multi-tenant clusters, enforcing strict policies exactly as you intend is harder than it looks.

Here are some of the considerations that come up when you apply it.

It's relatively easy to configure something like this: User A may create a privileged Pod, but User B may not.

But this is not enough for most use cases. The policies must affect not only Users but also the Kubernetes Controllers that actually create Pods.

Who creates Pods?

In most situations, Users don't create Pod resources directly; they usually create a Deployment resource to run Pods.

When you create a Deployment, it is the ReplicaSet Controller that actually creates the Pods.

In this case, it’s not enough to enable a policy for the User who created the Deployment; if the policy is not properly applied to the ReplicaSet Controller as well, the User will be able to run a privileged pod indirectly.

If you are using other controllers, such as the DaemonSet Controller or a custom Kubernetes Operator, to create Pods, they must be configured properly as well.
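
You can see this on any Pod created through a Deployment: its ownerReferences point at a ReplicaSet, not at the User who created the Deployment (the Pod name below is hypothetical):

$ kubectl get pod my-app-5d9c7b8f4-abcde -o jsonpath='{.metadata.ownerReferences[0].kind}'
ReplicaSet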

Configure PSP for Controllers

If a ServiceAccount is specified in the Pod template spec of a Deployment, the PSPs bound to that ServiceAccount take effect. The same applies to DaemonSets and StatefulSets.
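
For example, a Deployment like the hypothetical one below runs its Pods under the my-app ServiceAccount, so the PSPs bound to my-app are the ones that take effect (all names here are illustrative):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      # The PSPs bound to this ServiceAccount are evaluated
      # when the ReplicaSet Controller creates the Pods.
      serviceAccountName: my-app
      containers:
      - name: app
        image: nginx:1.19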

On the other hand, when no ServiceAccount is specified in the Pod spec, the ServiceAccount named default in the Namespace where the Pod is launched is used. A default ServiceAccount exists in every Namespace.

So you need to make sure the appropriate PSPs are bound to the ServiceAccount.

You can reject Deployments that contain a privileged Pod specification by implementing your own validating webhooks, by using Open Policy Agent, and so on.

However, these approaches provide only static validation of the spec. PSP's powerful dynamic validations, such as blocking a container from running as the root user at the moment the kubelet starts it, are difficult to implement with anything other than PSP.
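
As a quick illustration of this dynamic behavior: under a MustRunAsNonRoot policy, a Pod spec that doesn't set runAsUser passes admission, but if the image is built to run as root the kubelet refuses to start the container. A sketch of what you would observe (nginx is used here just as a typical run-as-root image):

$ kubectl run nonroot-test --image=nginx --restart=Never
$ kubectl get pod nonroot-test
NAME           READY   STATUS                       RESTARTS   AGE
nonroot-test   0/1     CreateContainerConfigError   0          5s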

Priority of PSP

There are a few quirks and caveats about the policy priority when multiple PSPs are enabled.

The official documentation states:

PodSecurityPolicies which allow the pod as-is, without changing defaults or mutating the pod, are preferred. The order of these non-mutating PodSecurityPolicies doesn't matter.

If the pod must be defaulted or mutated, the first PSP (ordered by name) to allow the pod is selected.

In AWS IAM policies, for example, an explicit Deny always overrides an Allow; Deny takes priority. In the case of PSPs, however, it is the policy that allows the Pod as-is that takes precedence.

Therefore, I think the right approach is to first apply a restrictive policy broadly across the cluster, and then apply policies that allow privileged features only to the objects that actually require privileges.

It's hard to reason about the effective permissions when multiple PSPs match, so it's better to give up on combining policies to build complex privilege controls.
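
When in doubt, you can always check which PSP was actually applied to a Pod, because the admission controller records it in the kubernetes.io/psp annotation:

kubectl get pod YOUR_POD -o yaml | grep kubernetes.io/psp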

My Best Practice

Based on the above, I recommend the following ideas:

  • First of all, create two PSPs: one that allows privileges and one that is restricted.
  • Next, apply the restricted PSP to the entire cluster, e.g. via the system:authenticated Group.
  • Then select privileged Namespaces and apply the privileged PSP to them.
  • In the kube-system Namespace, however, you need to attach the privileged PSP to each individual ServiceAccount.
  • If some applications cannot run under the restricted PSP due to their behavior or characteristics, add another PSP and attach it to the Namespace where those apps run.
  • Use "psp-util", a kubectl plugin for manipulating PSPs (important 😊).

Step by Step: Configuring the Pod Security Policy Best Practice

Here’s an example of how to set it up as described in My Best Practice.

1. Install the kubectl plugin “psp-util”

Pod Security Policy is defined as a Kubernetes resource, PodSecurityPolicy, which is bound to a Group or ServiceAccount using a ClusterRole and a ClusterRoleBinding.
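
For reference, the binding looks roughly like this: a ClusterRole that grants the use verb on a specific PodSecurityPolicy, and a ClusterRoleBinding that ties it to a subject (the resource names here are illustrative; psp-util generates equivalents for you):

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: psp-restricted    # illustrative name
rules:
- apiGroups: ['policy']
  resources: ['podsecuritypolicies']
  verbs: ['use']
  resourceNames: ['restricted']
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: psp-restricted    # illustrative name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: psp-restricted
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: system:authenticated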

A kubectl plugin called "psp-util" is available 🚀 on Krew, the official plugin manager for kubectl.

It helps you to manage Pod Security Policy with the associated RBAC resources.

You can view the relations between policies and objects in your cluster, and easily attach and detach the policies.

After installing Krew, you can install psp-util with the following command:

kubectl krew install psp-util

2. See the current settings

I will use EKS as the example, since EKS is the environment I usually work with.

First, check the current PSPs in the cluster.

$ kubectl get psp
NAME             PRIV    CAPS   SELINUX    RUNASUSER          FSGROUP     SUPGROUP    READONLYROOTFS   VOLUMES
eks.privileged   true    *      RunAsAny   RunAsAny           RunAsAny    RunAsAny    false            *

EKS has a PSP named eks.privileged by default. This is a policy that allows all Pods to be run. https://docs.aws.amazon.com/ja_jp/eks/latest/userguide/pod-security-policy.html

Next, check the scope the policy affects.

Run the psp-util tree command. You will see that the PSP is applied to the system:authenticated Group.

$ kubectl psp-util tree
📙 PSP eks.privileged
└── 📕 ClusterRole eks:podsecuritypolicy:privileged
    └── 📘 ClusterRoleBinding eks:podsecuritypolicy:authenticated
        └── 📗 Subject{Kind: Group, Name: system:authenticated, Namespace:}

The system:authenticated Group is the group to which all requests authenticated by the API Server belong. This means everyone in the cluster is covered by this policy.

3. Create PSP

3-1. Create a privileged PSP

As mentioned above, EKS has a privileged PSP by default.

Here is the allow-everything PSP configuration in EKS:

apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  annotations:
    kubernetes.io/description: privileged allows full unrestricted access to pod features,
      as if the PodSecurityPolicy controller was not enabled.
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: '*'
  labels:
    eks.amazonaws.com/component: pod-security-policy
    kubernetes.io/cluster-service: "true"
  name: eks.privileged
spec:
  allowPrivilegeEscalation: true
  allowedCapabilities:
  - '*'
  fsGroup:
    rule: RunAsAny
  hostIPC: true
  hostNetwork: true
  hostPID: true
  hostPorts:
  - max: 65535
    min: 0
  privileged: true
  runAsUser:
    rule: RunAsAny
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  volumes:
  - '*'

If a privileged PSP does not exist in your cluster, create and apply the above configuration file.

kubectl apply -f psp-privileged.yaml

It's also fine to create another PSP that suits your environment if you want a more restricted policy for the privileged Namespaces and Pods.

3-2. Create a restricted PSP

For the restricted PSP, the best option is the one from the EKS Best Practices Recommendations published by AWS.

apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
    name: restricted
    annotations:
        seccomp.security.alpha.kubernetes.io/allowedProfileNames: 'docker/default,runtime/default' # Remove if seccomp is not enabled
        apparmor.security.beta.kubernetes.io/allowedProfileNames: 'runtime/default' # Remove if apparmor is not enabled
        seccomp.security.alpha.kubernetes.io/defaultProfileName:  'runtime/default' # Remove if seccomp is not enabled
        apparmor.security.beta.kubernetes.io/defaultProfileName:  'runtime/default' # Remove if apparmor is not enabled
spec:
    privileged: false
    # Required to prevent escalations to root.
    allowPrivilegeEscalation: false
    # This is redundant with non-root + disallow privilege escalation,
    # but we can provide it for defense in depth.
    requiredDropCapabilities:
    - ALL
    # Allow core volume types.
    volumes:
    - 'configMap'
    - 'emptyDir'
    - 'projected'
    - 'secret'
    - 'downwardAPI'
    # Assume that persistentVolumes set up by the cluster admin are safe to use.
    - 'persistentVolumeClaim'
    hostNetwork: false
    hostIPC: false
    hostPID: false
    runAsUser:
        # Require the container to run without root privileges.
        rule: 'MustRunAsNonRoot'
    seLinux:
        # This policy assumes the nodes are using AppArmor rather than SELinux.
        rule: 'RunAsAny'
    supplementalGroups:
        rule: 'MustRunAs'
        ranges:
        # Forbid adding the root group.
        - min: 1
          max: 65535
    fsGroup:
        rule: 'MustRunAs'
        ranges:
        # Forbid adding the root group.
        - min: 1
          max: 65535
    readOnlyRootFilesystem: false

Save the above configuration to a file and apply it.

kubectl apply -f psp-restricted.yaml

If your applications are already running in the cluster and you want to define a policy scoped to what they currently need, you can use kube-psp-advisor, maintained by Sysdig, which automatically generates an appropriate PSP.

It is also available on Krew.

kubectl krew install advise-psp

The inspect command automatically generates a PSP for a given Namespace where the target application is currently running. The following command applies the PSP generated by the inspect result.

kubectl advise-psp inspect --namespace=YOUR_NAMESPACE | kubectl apply -f -
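
If you'd rather review the generated policy before it hits the cluster (recommended), write it to a file first:

kubectl advise-psp inspect --namespace=YOUR_NAMESPACE > psp-generated.yaml
# review psp-generated.yaml, then:
kubectl apply -f psp-generated.yaml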

4. Apply PSPs

Now we have both a privileged PSP and a restricted PSP.

(In the case of EKS) Detach the privileged PSP from the system:authenticated Group.

The attachment is expressed as a ClusterRoleBinding, so just remove the ClusterRoleBinding named eks:podsecuritypolicy:authenticated. Note that until you attach the restricted PSP in the next step, new Pods may fail admission, so perform these two steps back-to-back.

kubectl delete ClusterRoleBinding eks:podsecuritypolicy:authenticated

4-1. Apply the restricted PSP

To apply the restricted PSP to the whole cluster, attach it to the system:authenticated Group.

Assuming you have created a restricted PSP named restricted, run the following command:

kubectl psp-util attach restricted --group system:authenticated

Check the current state. A ClusterRole and ClusterRoleBinding have been created automatically.

$ kubectl psp-util tree
📙 PSP eks.privileged
└── 📕 ClusterRole eks:podsecuritypolicy:privileged

📙 PSP restricted
└── 📕 ClusterRole psp-util.restricted
    └── 📘 ClusterRoleBinding psp-util.restricted
        └── 📗 Subject{Kind: Group, Name: system:authenticated, Namespace:}

4-2. Apply the privileged PSP

4-2-1. For Administrative Group

Attach the privileged PSP to the system:masters Group, the administrative group.

kubectl psp-util attach eks.privileged --group system:masters

For EKS, also attach the privileged PSP to the eks Group.

kubectl psp-util attach eks.privileged --group eks

4-2-2. For ServiceAccounts in privileged Namespaces

Attach the privileged PSP to ServiceAccounts in the privileged Namespace.

For example, let's treat the default Namespace as a privileged Namespace.

Run the following command:

kubectl psp-util attach eks.privileged --group system:serviceaccounts:default

If you want to apply the privileged PSP to ServiceAccounts in another Namespace, change the Namespace in the above command and run it.
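
If you only want to grant the privileged PSP to a single ServiceAccount rather than to every ServiceAccount in the Namespace, the --sa flag (used again below for kube-system) does that; my-sa and my-namespace are placeholders:

kubectl psp-util attach eks.privileged --sa my-sa -n my-namespace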

4-2-3. For ServiceAccounts in kube-system Namespace

You need to be careful when attaching the privileged PSP in kube-system, which contains the system controllers' ServiceAccounts such as replicaset-controller.

$ kubectl get sa -n kube-system
NAME                                 SECRETS   AGE
alb-ingress-controller               1         277d
attachdetach-controller              1         380d
aws-cloud-provider                   1         380d
aws-node                             1         380d
certificate-controller               1         380d
clusterrole-aggregation-controller   1         380d
coredns                              1         380d
cronjob-controller                   1         380d
daemon-set-controller                1         380d
default                              1         380d
deployment-controller                1         380d
disruption-controller                1         380d
eks-vpc-resource-controller          1         12d
endpoint-controller                  1         380d
expand-controller                    1         380d
generic-garbage-collector            1         380d
horizontal-pod-autoscaler            1         380d
job-controller                       1         380d
kube-proxy                           1         380d
kubernetes-route53-sync              1         87d
namespace-controller                 1         380d
node-controller                      1         380d
node-problem-detector                1         177d
persistent-volume-binder             1         380d
pod-garbage-collector                1         380d
pv-protection-controller             1         380d
pvc-protection-controller            1         380d
replicaset-controller                1         380d
replication-controller               1         380d
resourcequota-controller             1         380d
service-account-controller           1         380d
service-controller                   1         380d
statefulset-controller               1         380d
ttl-controller                       1         380d
vpc-resource-controller              1         12d

If you attach the privileged PSP to replicaset-controller, every Pod created from a Deployment anywhere in the cluster gets the privileged PSP. That defeats the purpose of the restricted policy, so you must instead attach the privileged PSP to each workload's ServiceAccount individually.

For example, if there are only Pods created by DaemonSets and Deployments in the cluster, you can list the relevant ServiceAccounts with the following commands.

# ServiceAccounts used by DaemonSets
$ kubectl -n kube-system get ds -o jsonpath='{.items[*].spec.template.spec.serviceAccountName}'
aws-node kube-proxy

# ServiceAccounts used by Deployments
$ kubectl -n kube-system get deploy -o jsonpath='{.items[*].spec.template.spec.serviceAccountName}'
alb-ingress-controller coredns

# Attaching in one line
# DaemonSet's ServiceAccounts
for i in $(kubectl -n kube-system get ds -o jsonpath='{.items[*].spec.template.spec.serviceAccountName}'); do kubectl psp-util attach eks.privileged --sa $i -n kube-system; done

# Deployment's ServiceAccounts
for i in $(kubectl -n kube-system get deploy -o jsonpath='{.items[*].spec.template.spec.serviceAccountName}'); do kubectl psp-util attach eks.privileged --sa $i -n kube-system; done

5. Finished

That’s all. Here is the result of the configuration so far.

$ kubectl psp-util tree
📙 PSP eks.privileged
└── 📕 ClusterRole eks:podsecuritypolicy:privileged
└── 📕 ClusterRole psp-util.eks.privileged
    └── 📘 ClusterRoleBinding psp-util.eks.privileged
        └── 📗 Subject{Kind: Group, Name: eks, Namespace: }
        └── 📗 Subject{Kind: Group, Name: system:masters, Namespace: }
        └── 📗 Subject{Kind: ServiceAccount, Name: aws-node, Namespace: kube-system}
        └── 📗 Subject{Kind: ServiceAccount, Name: kube-proxy, Namespace: kube-system}
        └── 📗 Subject{Kind: ServiceAccount, Name: alb-ingress-controller, Namespace: kube-system}
        └── 📗 Subject{Kind: ServiceAccount, Name: coredns, Namespace: kube-system}
📙 PSP restricted
└── 📕 ClusterRole psp-util.restricted
    └── 📘 ClusterRoleBinding psp-util.restricted
        └── 📗 Subject{Kind: Group, Name: system:authenticated, Namespace: }
        └── 📗 Subject{Kind: Group, Name: system:unauthenticated, Namespace: }

If you need a PSP somewhere between the privileged and restricted ones, create a dedicated PSP and apply it in the same way as the privileged PSP.

Note, however, that the privileged PSP takes precedence because it allows Pods as-is without mutating them, so verify and apply it carefully.
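
As a final check, you can verify who is allowed to use each PSP with kubectl auth can-i and impersonation; the ServiceAccount below is just one example from the setup above:

# Should print "yes" after the steps above
kubectl auth can-i use podsecuritypolicy/eks.privileged --as=system:serviceaccount:kube-system:aws-node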

Have a good k8s experience 👍