Using Kubernetes's new Bound Service Account Tokens for secure workload identity

Security is a first-class concern for Linkerd. Linkerd plays a critical role in the overall security of the systems it runs in, and that is only possible if Linkerd itself is secure. We recently added support for Kubernetes’s new bound service account tokens to Linkerd, which is a big step forward for security. But why? To understand that, we first need to understand how Linkerd uses service accounts.

Linkerd provides mutual TLS (mTLS) to secure communication between workloads. Central to any type of communication security is the notion of identity. As discussed in the Kubernetes engineer’s guide to mTLS, without identity you have no authenticity, and without authenticity you do not have secure communication. All of Linkerd’s mTLS magic is possible because the control plane (specifically the identity component) issues a certificate that the proxy uses to authenticate itself to other services.

But what is the identity contained in this TLS certificate? And how does Linkerd’s identity component ensure that it is issuing a certificate to a proxy in the cluster, and not to some intruder trying to communicate with other services in the cluster? In other words, how does the control plane verify the identities of the proxies themselves? We’ll answer those questions in this blog post. Let’s dive in!

Kubernetes Service Accounts

This is not just a Linkerd problem. Many components and Kubernetes controllers need to verify the identity of their clients (for example, whether they are running in the cluster or not) before serving them. Kubernetes therefore provides service accounts, which are attached to your pods by default and which the application inside can use to prove to other components that it is part of the Kubernetes cluster. The service account token is attached to the pod as a volume and mounted into the container at the /var/run/secrets/kubernetes.io/serviceaccount path. By default, Kubernetes attaches the default service account of the pod’s namespace.

spec:
  containers:
  ...
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-tsbwl
      readOnly: true
  ...
  volumes:
  - name: kube-api-access-tsbwl
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace

Service accounts are also commonly used with Kubernetes RBAC to grant pods access to the Kubernetes API. This is done by creating a ServiceAccount object and binding a ClusterRole (with the necessary permissions) to it using a ClusterRoleBinding. You then specify that service account in the serviceAccountName field of your workload, which overrides the default service account that is present in each namespace. The default service account token has no permissions to view, list, or modify any resources in the cluster.

When Kubernetes attaches the default service account token, it also mounts the kube-root-ca.crt ConfigMap (as seen in the YAML above), which contains the trusted root certificate of the API server. Applications use it to verify the API server’s TLS certificate when they communicate with the API server.

The Linkerd proxy never needed any of these additional files apart from the token, as it never talks to the Kubernetes API itself. (We will see later how bound service account tokens fix this.)

So how does Linkerd validate that its proxies are who they say they are?

For the proxy to get its certificate, it needs to prove its identity to the identity component. It does this by embedding the service account token in the Certify request that it sends every time a new certificate is needed (certificates expire after 24 hours by default). The identity component validates the token by calling the Kubernetes TokenReview API, and returns a CertifyResponse with the certificate only after that succeeds. The identity component not only verifies that the token is valid, it also verifies that the token belongs to the same service account as the identity the proxy is requesting a certificate for. It does so by looking at Status.User.Username in the TokenReview response, which the Kubernetes API sets to the service account the token was issued for, in the form system:serviceaccount:<namespace>:<name>.

Only the identity component in Linkerd has the necessary API access to verify tokens. Once a token is verified, the identity component issues a certificate for the proxy to use to communicate with other services.
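
As a rough illustration of the check described above, here is a minimal Python sketch. It is not Linkerd’s actual implementation. It assumes that Kubernetes reports service account users as system:serviceaccount:<namespace>:<name>, and that workload identities follow the <service-account>.<namespace>.serviceaccount.identity.linkerd.cluster.local pattern, that is, a control plane in the linkerd namespace with the cluster.local trust domain. The function names and the dictionary shape (mirroring the status field of a TokenReview response) are hypothetical.

```python
# Sketch of the token check; this is NOT Linkerd's actual code.
# Assumption: control plane namespace "linkerd", trust domain "cluster.local".
IDENTITY_SUFFIX = "serviceaccount.identity.linkerd.cluster.local"


def identity_from_username(username: str) -> str:
    """Map a TokenReview username to a Linkerd-style identity name."""
    parts = username.split(":")
    # Kubernetes names service account users system:serviceaccount:<ns>:<name>.
    if len(parts) != 4 or parts[0] != "system" or parts[1] != "serviceaccount":
        raise ValueError(f"not a service account username: {username!r}")
    namespace, service_account = parts[2], parts[3]
    return f"{service_account}.{namespace}.{IDENTITY_SUFFIX}"


def may_issue_certificate(status: dict, requested_identity: str) -> bool:
    """Allow issuance only for an authenticated token that matches the request."""
    if not status.get("authenticated", False):
        return False
    username = status.get("user", {}).get("username", "")
    try:
        return identity_from_username(username) == requested_identity
    except ValueError:
        # Tokens that do not belong to a service account are rejected.
        return False
```

With this sketch, a token for the web service account in the emojivoto namespace could only be exchanged for a certificate carrying the web.emojivoto identity, and never for another workload’s identity.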

How does Linkerd provide workload identity?

Linkerd takes a beautiful (in my mind) simplifying step here: the service accounts aren’t just used to validate that the proxies are who they say they are, they’re used as the basis of the workload’s identity itself. This gives us a workload identity that is already tied to the capabilities granted to the pod, and means that we can provide mTLS without any additional configuration! This is the secret behind Linkerd’s ability to provide on-by-default mTLS for all meshed pods.

Whenever Linkerd establishes a mutual TLS connection between two endpoints, the identity exchanged is that of the service account on either side. This identity is even wired into Linkerd’s metrics: whenever a meshed request is received or sent, the relevant metrics also include the service account with which that peer was associated.

Here is an example metric from the emojivoto example:

request_total{..., client_id="web.emojivoto.serviceaccount.identity.linkerd.cluster.local", authority="emoji-svc.emojivoto.svc.cluster.local:8080",  namespace="emojivoto", pod="emoji-696d9d8f95-5sj4j"} 14532

As you can see, the client_id label in the above metric is the service account that was attached to the client pod from which the request was received.
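
To make the structure of that label concrete, the following Python sketch splits a client_id value into its service account and namespace. It is only an illustration based on the label format shown above; the helper name parse_client_id is hypothetical and not part of any Linkerd tooling.

```python
from typing import Tuple

# Fixed part that separates "<service-account>.<namespace>" from the rest.
MARKER = ".serviceaccount.identity."


def parse_client_id(client_id: str) -> Tuple[str, str]:
    """Split a client_id label value into (service_account, namespace)."""
    prefix, found, _suffix = client_id.partition(MARKER)
    # A namespace is a single DNS label, so it is the last dot-separated part
    # of the prefix; the service account name is everything before it.
    service_account, dot, namespace = prefix.rpartition(".")
    if not found or not dot or not service_account or not namespace:
        raise ValueError(f"unexpected client_id: {client_id!r}")
    return service_account, namespace
```

For the metric above, this returns the web service account in the emojivoto namespace.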

Authorization Policy

Linkerd’s new authorization policy feature allows users to specify the set of clients that may access a given set of resources. It builds on the same identity: in a ServerAuthorization resource, users list the service accounts of the clients that should be allowed to talk to a group of workloads (grouped by the Server resource).

apiVersion: policy.linkerd.io/v1beta1
kind: ServerAuthorization
metadata:
  namespace: emojivoto
  name: internal-grpc
  labels:
    app.kubernetes.io/part-of: emojivoto
    app.kubernetes.io/version: v11
spec:
  server:
    selector:
      matchLabels:
        emojivoto/api: internal-grpc
  client:
    meshTLS:
      serviceAccounts:
        - name: web

In the above example, we are permitting workloads that use the web service account to talk to the internal-grpc server.
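
The matching rule can be sketched in a few lines of Python. This is an illustration only, not how Linkerd enforces policy internally. It assumes the ServerAuthorization has been loaded into a plain dictionary, and that a serviceAccounts entry without a namespace refers to the namespace of the ServerAuthorization itself. The function name is hypothetical.

```python
def client_is_authorized(server_authz: dict, client_sa: str, client_ns: str) -> bool:
    """Check a client's service account against spec.client.meshTLS.serviceAccounts."""
    default_ns = server_authz.get("metadata", {}).get("namespace", "")
    mesh_tls = server_authz.get("spec", {}).get("client", {}).get("meshTLS", {})
    for entry in mesh_tls.get("serviceAccounts", []):
        # Assumption: an entry without a namespace refers to the namespace of
        # the ServerAuthorization itself.
        same_name = entry.get("name") == client_sa
        same_namespace = entry.get("namespace", default_ns) == client_ns
        if same_name and same_namespace:
            return True
    return False


# The ServerAuthorization from the example above, reduced to the relevant fields.
internal_grpc = {
    "metadata": {"namespace": "emojivoto", "name": "internal-grpc"},
    "spec": {"client": {"meshTLS": {"serviceAccounts": [{"name": "web"}]}}},
}
```

Here client_is_authorized(internal_grpc, "web", "emojivoto") is true, while a client using any other service account is rejected.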

Bound Service Account Tokens

Though all of this is great, there’s still a catch. The default token is intended for applications to talk to the Kubernetes API, not specifically for Linkerd. Linkerd also doesn’t need the extra certificates that are part of the default volume mount. Reusing this token is not a security best practice: with the default service account token, Linkerd gets more permissions than it really needs, and that’s a potential vulnerability waiting to happen. It also means that the token is managed by controls outside of Linkerd. Users might want to use those controls, which causes problems because Linkerd expects the token to be present for verification. For example, users can explicitly disable token auto-mount on their pods. As of Linkerd 2.11, we skip pod injection if token auto-mount is disabled.

To address these challenges, starting with edge-21.11.1 we have added support for auto-mounting bound service account tokens. Instead of using the token that is mounted by default, Linkerd requests its own token using the Bound Service Account Tokens feature. This feature (GA as of Kubernetes v1.20) allows components to request, on demand from the API server, tokens for a specific service account that are bound to a specific purpose (instead of the default token, which is used to access the API server).

Using this, the Linkerd injector requests a token that is bound specifically to Linkerd, with a 24h expiry (matching the expiry of the identity certificate). This token is generated for the same service account that Kubernetes mounted into the pod, and thus does not affect any of Linkerd’s existing functionality around identity and policy discussed above.

spec:
  containers:
  ...
    volumeMounts:
    - name: linkerd-identity-token
      mountPath: /var/run/secrets/kubernetes.io/serviceaccount
  ...
  volumes:
  - name: linkerd-identity-token
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          audience: identity.l5d.io
          expirationSeconds: 86400
          path: linkerd-identity-token

As you can see, this token is generated specifically for Linkerd, so that proxies can verify themselves with the identity component, and it cannot be used to talk to the Kubernetes API, giving us a nice separation of concerns.
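
One way to see this separation for yourself is to look at the claims inside the token. Service account tokens are JWTs, so the following Python sketch decodes the payload and checks the audience. It deliberately does not verify the signature, so it is only useful for inspection; real validation still goes through the TokenReview API. The function names are hypothetical.

```python
import base64
import json


def decode_claims(token: str) -> dict:
    """Decode the claims (payload) segment of a JWT without verifying it."""
    segments = token.split(".")
    if len(segments) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")
    payload = segments[1]
    # JWTs use unpadded base64url, so restore the padding before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def has_audience(claims: dict, audience: str) -> bool:
    """Return True if the audience appears in the token's aud claim."""
    aud = claims.get("aud", [])
    if isinstance(aud, str):  # the JWT spec allows a single string or a list
        aud = [aud]
    return audience in aud
```

For a token projected with the volume above, has_audience(decode_claims(token), "identity.l5d.io") would be expected to hold, since the audience field in the volume definition ends up in the aud claim.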

Conclusion

In this post, we described the motivation for moving to Kubernetes’s new bound service account tokens, which reduce the scope of Linkerd’s access to the Kubernetes API to the bare minimum necessary to support its security features. We also uncovered some of the inner workings of how the control plane validates the proxies before issuing the certificates, and saw how Linkerd uses Kubernetes’s service accounts as a primitive to build features like authorization policy.

Our goal with Linkerd is to provide world-class security for Kubernetes users without imposing a burden on them. By relying on service accounts, we can provide on-by-default mutual TLS with zero config for all meshed pods, the moment you install Linkerd. And with bound service account tokens, the implementation is even more secure than before.

Linkerd is for everyone

Linkerd is a graduated project of the Cloud Native Computing Foundation. Linkerd is committed to open governance. If you have feature requests, questions, or comments, we’d love to have you join our rapidly-growing community! Linkerd is hosted on GitHub, and we have a thriving community on Slack, Twitter, and the mailing lists. Come and join the fun!