Troubleshooting

This section provides resolution steps for common problems reported with the linkerd check command.

The “pre-kubernetes-cluster-setup” checks

These checks only run when the --pre flag is set. This flag is intended for use prior to running linkerd install, to verify your cluster is prepared for installation.

√ control plane namespace does not already exist

Example failure:

× control plane namespace does not already exist
    The "linkerd" namespace already exists

By default, linkerd install creates a linkerd namespace, so that namespace must not exist prior to installation. To check with a different namespace, run:

linkerd check --pre --linkerd-namespace linkerd-test

√ can create Kubernetes resources

The subsequent checks in this section validate whether you have permission to create the Kubernetes resources required for Linkerd installation, specifically:

√ can create Namespaces
√ can create ClusterRoles
√ can create ClusterRoleBindings
√ can create CustomResourceDefinitions

For more information on cluster access, see the GKE Setup section above.
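
The permission checks above can also be reproduced by hand with kubectl auth can-i. The following is a minimal sketch, not part of linkerd or kubectl: the function name can_create_all is hypothetical, and it assumes kubectl prints "yes" or "no" on standard output.

```shell
# Hypothetical helper: report whether the current user may create each
# resource type named on the command line.
can_create_all() {
    status=0
    for resource in "$@"; do
        # `kubectl auth can-i` prints "yes" or "no" on standard output.
        answer=$(kubectl auth can-i create "$resource")
        printf '%s: %s\n' "$resource" "$answer"
        [ "$answer" = "yes" ] || status=1
    done
    return "$status"
}
```

For the resources listed above, it could be called as: can_create_all namespaces clusterroles clusterrolebindings customresourcedefinitions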

The “pre-kubernetes-setup” checks

These checks only run when the --pre flag is set. This flag is intended for use prior to running linkerd install, to verify you have the correct RBAC permissions to install Linkerd.

√ can create ServiceAccounts
√ can create Services
√ can create Deployments
√ can create ConfigMaps

For more information on cluster access, see the GKE Setup section above.

√ no clock skew detected

This check detects clock skew between the system running the linkerd install command and the Kubernetes node(s), which can cause issues.

The “pre-kubernetes-capability” checks

These checks only run when the --pre flag is set. This flag is intended for use prior to running linkerd install, to verify you have the correct Kubernetes capability permissions to install Linkerd.

√ has NET_ADMIN capability

Example failure:

× has NET_ADMIN capability
    found 3 PodSecurityPolicies, but none provide NET_ADMIN
    see https://linkerd.io/checks/#pre-k8s-cluster-net-admin for hints

Linkerd installation requires the NET_ADMIN Kubernetes capability, to allow for modification of iptables.

For more information, see the Kubernetes documentation on Pod Security Policies, Security Contexts, and the man page on Linux Capabilities.
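
To see which PodSecurityPolicies would satisfy this check, one option is to list each policy's allowed capabilities and filter for NET_ADMIN. This is only a sketch under stated assumptions: both function names are hypothetical, only spec.allowedCapabilities is inspected (with "*" treated as allowing every capability), and the cluster must still serve the PodSecurityPolicy API.

```shell
# Hypothetical filter: reads "<policy-name><TAB><allowedCapabilities>" lines
# on stdin and prints the names of policies that allow NET_ADMIN (or "*").
psp_allowing_net_admin() {
    awk -F '\t' '$2 ~ /NET_ADMIN|[*]/ { print $1 }'
}

# Hypothetical helper: list every policy with its allowed capabilities,
# one policy per line, in the format the filter above expects.
list_psp_capabilities() {
    kubectl get podsecuritypolicies \
        -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.allowedCapabilities}{"\n"}{end}'
}
```

Running list_psp_capabilities | psp_allowing_net_admin and getting no output would be consistent with the example failure above.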

The “pre-kubernetes-single-namespace-setup” checks

If you do not expect to have permission for a full cluster install, try the --single-namespace flag, which validates whether Linkerd can be installed in a single namespace with limited cluster access:

linkerd check --pre --single-namespace

√ control plane namespace exists

Example failure:

× control plane namespace exists
    The "linkerd" namespace does not exist

In --single-namespace mode, linkerd check assumes that the installer does not have permission to create a namespace, so the installation namespace must already exist.

By default, the linkerd namespace is used. To use a different namespace, run:

linkerd check --pre --single-namespace --linkerd-namespace linkerd-test

√ can create Kubernetes resources

The subsequent checks in this section validate whether you have permission to create the Kubernetes resources required for Linkerd --single-namespace installation, specifically:

√ can create Roles
√ can create RoleBindings

For more information on cluster access, see the GKE Setup section above.

The “kubernetes-api” checks

Example failures:

× can initialize the client
    error configuring Kubernetes API client: stat badconfig: no such file or directory
× can query the Kubernetes API
    Get https://8.8.8.8/version: dial tcp 8.8.8.8:443: i/o timeout

Ensure that your system is configured to connect to a Kubernetes cluster. Validate that the KUBECONFIG environment variable is set properly, and/or ~/.kube/config points to a valid cluster.

For more information see these pages in the Kubernetes Documentation:

Also verify that these commands work:

kubectl config view
kubectl cluster-info
kubectl version
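
The "stat badconfig: no such file or directory" failure above points at a kubeconfig path that does not exist. The sketch below checks each path kubectl would consult. It assumes a POSIX shell and the colon-separated KUBECONFIG format used on Linux and macOS; the function name check_kubeconfig_paths is hypothetical.

```shell
# Hypothetical helper: verify that every kubeconfig path exists.
# KUBECONFIG may hold several colon-separated paths; when it is unset,
# fall back to ~/.kube/config.
check_kubeconfig_paths() {
    paths=${KUBECONFIG:-$HOME/.kube/config}
    status=0
    old_ifs=$IFS
    IFS=:                      # split the list on colons
    for path in $paths; do
        if [ -f "$path" ]; then
            echo "ok: $path"
        else
            echo "missing: $path"
            status=1
        fi
    done
    IFS=$old_ifs
    return "$status"
}
```

A non-zero exit status means at least one listed file is missing. A file that exists can still point at an unreachable cluster, so this complements the kubectl commands above rather than replacing them.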

Another example failure:

✘ can query the Kubernetes API
    Get REDACTED/version: x509: certificate signed by unknown authority

As an (unsafe) workaround, you can disable TLS certificate verification for the cluster:

kubectl config set-cluster ${KUBE_CONTEXT} --insecure-skip-tls-verify=true \
    --server=${KUBE_CONTEXT}

The “kubernetes-version” checks

√ is running the minimum Kubernetes API version

Example failure:

× is running the minimum Kubernetes API version
    Kubernetes is on version [1.7.16], but version [1.10.0] or more recent is required

Linkerd requires at least version 1.10.0. Verify your cluster version with:

kubectl version

√ is running the minimum kubectl version

Example failure:

× is running the minimum kubectl version
    kubectl is on version [1.9.1], but version [1.10.0] or more recent is required
    see https://linkerd.io/checks/#kubectl-version for hints

Linkerd requires at least version 1.10.0. Verify your kubectl version with:

kubectl version --client --short

To fix this, update kubectl to a supported version.

For more information on upgrading Kubernetes, see the page in the Kubernetes Documentation on Upgrading a cluster.
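
Both version checks compare a reported version against the 1.10.0 minimum. A plain string comparison would rank 1.9.1 above 1.10.0, so a version-aware comparison is needed. The sketch below is one way to do that in a script; the function name version_ge is hypothetical, and it assumes a sort that supports -V (version sort), as GNU coreutils does.

```shell
# Hypothetical helper: succeed when version $1 is at least version $2.
# Relies on `sort -V`, which orders version strings component by component.
version_ge() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    # If the minimum sorts first (or both are equal), $1 is new enough.
    [ "$lowest" = "$2" ]
}
```

For example, version_ge 1.9.1 1.10.0 fails, matching the kubectl failure shown above, while version_ge 1.10.0 1.10.0 succeeds.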

The “linkerd-config” checks

This category of checks validates that Linkerd’s cluster-wide RBAC and related resources have been installed. These checks run via a default linkerd check, and also in the context of a multi-stage setup, for example:

# install cluster-wide resources (first stage)
linkerd install config | kubectl apply -f -

# validate successful cluster-wide resources installation
linkerd check config

# install Linkerd control plane
linkerd install control-plane | kubectl apply -f -

# validate successful control-plane installation
linkerd check

√ control plane Namespace exists

Example failure:

× control plane Namespace exists
    The "foo" namespace does not exist
    see https://linkerd.io/checks/#l5d-existence-ns for hints

Ensure the Linkerd control plane namespace exists:

kubectl get ns

The default control plane namespace is linkerd. If you installed Linkerd into a different namespace, specify that in your check command:

linkerd check --linkerd-namespace linkerdtest

√ control plane ClusterRoles exist

Example failure:

× control plane ClusterRoles exist
    missing ClusterRoles: linkerd-linkerd-controller
    see https://linkerd.io/checks/#l5d-existence-cr for hints

Ensure the Linkerd ClusterRoles exist:

$ kubectl get clusterroles | grep linkerd
linkerd-linkerd-controller                                             9d
linkerd-linkerd-identity                                               9d
linkerd-linkerd-prometheus                                             9d
linkerd-linkerd-proxy-injector                                         20d
linkerd-linkerd-sp-validator                                           9d

Also ensure you have permission to create ClusterRoles:

$ kubectl auth can-i create clusterroles
yes
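
Rather than scanning the grep output by eye, the expected names can be compared against the cluster. This is a sketch, assuming the five ClusterRole names from the sample output above and a control plane in the default linkerd namespace; the function name missing_clusterroles is hypothetical.

```shell
# Hypothetical helper: print any expected Linkerd ClusterRole that is absent.
# The expected names are taken from the sample output above.
missing_clusterroles() {
    # `-o name` prints one "<kind>/<name>" entry per line.
    existing=$(kubectl get clusterroles -o name)
    for component in controller identity prometheus proxy-injector sp-validator; do
        role="linkerd-linkerd-$component"
        # Match the role name exactly at the end of the line.
        printf '%s\n' "$existing" | grep -q "/$role\$" || echo "missing: $role"
    done
}
```

No output means every expected ClusterRole was found.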

√ control plane ClusterRoleBindings exist

Example failure:

× control plane ClusterRoleBindings exist
    missing ClusterRoleBindings: linkerd-linkerd-controller
    see https://linkerd.io/checks/#l5d-existence-crb for hints

Ensure the Linkerd ClusterRoleBindings exist:

$ kubectl get clusterrolebindings | grep linkerd
linkerd-linkerd-controller                             9d
linkerd-linkerd-identity                               9d
linkerd-linkerd-prometheus                             9d
linkerd-linkerd-proxy-injector                         20d
linkerd-linkerd-sp-validator                           9d

Also ensure you have permission to create ClusterRoleBindings:

$ kubectl auth can-i create clusterrolebindings
yes

√ control plane ServiceAccounts exist

Example failure:

× control plane ServiceAccounts exist
    missing ServiceAccounts: linkerd-controller
    see https://linkerd.io/checks/#l5d-existence-sa for hints

Ensure the Linkerd ServiceAccounts exist:

$ kubectl -n linkerd get serviceaccounts
NAME                     SECRETS   AGE
default                  1         23m
linkerd-controller       1         23m
linkerd-grafana          1         23m
linkerd-identity         1         23m
linkerd-prometheus       1         23m
linkerd-proxy-injector   1         7m
linkerd-sp-validator     1         23m
linkerd-web              1         23m

Also ensure you have permission to create ServiceAccounts in the Linkerd namespace:

$ kubectl -n linkerd auth can-i create serviceaccounts
yes

√ control plane CustomResourceDefinitions exist

Example failure:

× control plane CustomResourceDefinitions exist
    missing CustomResourceDefinitions: serviceprofiles.linkerd.io
    see https://linkerd.io/checks/#l5d-existence-crd for hints

Ensure the Linkerd CRD exists:

$ kubectl get customresourcedefinitions
NAME                         CREATED AT
serviceprofiles.linkerd.io   2019-04-25T21:47:31Z

Also ensure you have permission to create CRDs:

$ kubectl auth can-i create customresourcedefinitions
yes

The “linkerd-existence” checks

√ controller pod is running

Example failure:

× controller pod is running
    No running pods for "linkerd-controller"

Note that it takes some time for pods to be scheduled, images to be pulled, and everything to start up. If the error persists, validate the state of the controller pod with:

$ kubectl -n linkerd get po --selector linkerd.io/control-plane-component=controller
NAME                                  READY     STATUS    RESTARTS   AGE
linkerd-controller-7bb8ff5967-zg265   4/4       Running   0          40m

Check the controller’s logs with:

linkerd logs --control-plane-component controller

√ can initialize the client

Example failure:

× can initialize the client
    parse http:// bad/: invalid character " " in host name

Verify that a well-formed --api-addr parameter was specified, if any:

linkerd check --api-addr " bad"

√ can query the control plane API

Example failure:

× can query the control plane API
    Post http://8.8.8.8/api/v1/Version: context deadline exceeded

This check indicates a connectivity failure between the CLI and the Linkerd control plane. To verify connectivity, manually connect to the controller pod:

kubectl -n linkerd port-forward \
    $(kubectl -n linkerd get po \
        --selector=linkerd.io/control-plane-component=controller \
        -o jsonpath='{.items[*].metadata.name}') \
    9995:9995

…and then curl the /metrics endpoint:

curl localhost:9995/metrics
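
The two manual steps above can be combined into one probe. The following is a rough sketch rather than a supported tool: the function name check_controller_metrics is hypothetical, it picks only the first matching pod, and the two-second pause is an arbitrary guess at how long the tunnel needs to come up.

```shell
# Hypothetical helper: port-forward to the controller pod, probe /metrics,
# then tear the tunnel down again. Succeeds only if the probe succeeds.
check_controller_metrics() {
    pod=$(kubectl -n linkerd get po \
        --selector=linkerd.io/control-plane-component=controller \
        -o jsonpath='{.items[0].metadata.name}')
    kubectl -n linkerd port-forward "$pod" 9995:9995 >/dev/null 2>&1 &
    forward_pid=$!
    sleep 2    # arbitrary pause so the tunnel can come up
    if curl -sf -o /dev/null localhost:9995/metrics; then
        result=0
    else
        result=1
    fi
    # Stop the background port-forward whether or not the probe worked.
    kill "$forward_pid" 2>/dev/null || true
    return "$result"
}
```

A zero exit status means the /metrics endpoint answered through the tunnel.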

The “linkerd-api” checks

√ control plane pods are ready

Example failure:

× control plane pods are ready
    No running pods for "linkerd-web"

Verify the state of the control plane pods with:

$ kubectl -n linkerd get po
NAME                                      READY     STATUS    RESTARTS   AGE
pod/linkerd-controller-b8c4c48c8-pflc9    4/4       Running   0          45m
pod/linkerd-grafana-776cf777b6-lg2dd      2/2       Running   0          1h
pod/linkerd-prometheus-74d66f86f6-6t6dh   2/2       Running   0          1h
pod/linkerd-web-5f6c45d6d9-9hd9j          2/2       Running   0          3m
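
With several pods in the list, a small filter can pick out the ones that need attention. This is a sketch, assuming the column layout shown above (NAME, READY, STATUS, ...); the function name not_ready_pods is hypothetical.

```shell
# Hypothetical filter: reads `kubectl get po` output on stdin and prints
# pods that are not Running or whose READY count is incomplete.
not_ready_pods() {
    awk 'NR > 1 {
        split($2, ready, "/")    # READY column, e.g. "3/4"
        if (ready[1] != ready[2] || $3 != "Running") print $1
    }'
}
```

It could be used as: kubectl -n linkerd get po | not_ready_pods. No output means every listed pod is Running with all containers ready.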

√ control plane self-check

Example failure:

× control plane self-check
    Post https://localhost:6443/api/v1/namespaces/linkerd/services/linkerd-controller-api:http/proxy/api/v1/SelfCheck: context deadline exceeded

Check the logs on the control-plane’s public API:

linkerd logs --control-plane-component controller --container public-api

√ [kubernetes] control plane can talk to Kubernetes

Example failure:

× [kubernetes] control plane can talk to Kubernetes
    Error calling the Kubernetes API: FAIL

Check the logs on the control-plane’s public API:

linkerd logs --control-plane-component controller --container public-api

√ [prometheus] control plane can talk to Prometheus

Example failure:

× [prometheus] control plane can talk to Prometheus
    Error calling Prometheus from the control plane: FAIL

Validate that the Prometheus instance is up and running:

kubectl -n linkerd get all | grep prometheus

Check the Prometheus logs:

linkerd logs --control-plane-component prometheus

Check the logs on the control-plane’s public API:

linkerd logs --control-plane-component controller --container public-api

The “linkerd-service-profile” checks

Example failure:

‼ no invalid service profiles
    ServiceProfile "bad" has invalid name (must be "<service>.<namespace>.svc.cluster.local")

Validate the structure of your service profiles:

$ kubectl -n linkerd get sp
NAME                                               AGE
bad                                                51s
linkerd-controller-api.linkerd.svc.cluster.local   1m
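
The naming rule in the failure message can be checked mechanically. The sketch below assumes the default cluster domain (cluster.local) and lowercase DNS-style service and namespace names; the function name invalid_profile_names is hypothetical.

```shell
# Hypothetical filter: reads ServiceProfile names on stdin (one per line)
# and prints those that are not "<service>.<namespace>.svc.cluster.local".
invalid_profile_names() {
    grep -Ev '^[a-z0-9-]+\.[a-z0-9-]+\.svc\.cluster\.local$'
}

# Possible usage against a cluster:
#   kubectl -n linkerd get sp \
#       -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}' \
#       | invalid_profile_names
```

Given the two profiles listed above, the filter would print only "bad".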

The “linkerd-version” checks

√ can determine the latest version

Example failure:

× can determine the latest version
    Get https://versioncheck.linkerd.io/version.json?version=edge-19.1.2&uuid=test-uuid&source=cli: context deadline exceeded

Ensure you can connect to the Linkerd version check endpoint from the environment where the linkerd CLI is running:

$ curl "https://versioncheck.linkerd.io/version.json?version=edge-19.1.2&uuid=test-uuid&source=cli"
{"stable":"stable-2.1.0","edge":"edge-19.1.2"}
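
To use the response in a script, the version for one channel can be pulled out of the JSON. This is a minimal sketch that assumes the compact, single-line response format shown above; the function name latest_version is hypothetical, and a real JSON parser would be more robust.

```shell
# Hypothetical helper: read the version-check JSON on stdin and print the
# value for the channel named in $1 ("stable" or "edge").
# Assumes compact JSON with no whitespace around the colon.
latest_version() {
    sed -n "s/.*\"$1\":\"\\([^\"]*\\)\".*/\\1/p"
}
```

Piping the sample response above through latest_version edge prints edge-19.1.2.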

√ cli is up-to-date

Example failure:

‼ cli is up-to-date
    is running version 19.1.1 but the latest edge version is 19.1.2

See the page on Upgrading Linkerd.

The “control-plane-version” checks

Example failures:

‼ control plane is up-to-date
    is running version 19.1.1 but the latest edge version is 19.1.2
‼ control plane and cli versions match
    mismatched channels: running stable-2.1.0 but retrieved edge-19.1.2

See the page on Upgrading Linkerd.

The “linkerd-data-plane” checks

These checks only run when the --proxy flag is set. This flag is intended for use after running linkerd inject, to verify the injected proxies are operating normally.

√ data plane namespace exists

Example failure:

$ linkerd check --proxy --namespace foo
...
× data plane namespace exists
    The "foo" namespace does not exist

Ensure the specified --namespace exists, or omit the parameter to check all namespaces.

√ data plane proxies are ready

Example failure:

× data plane proxies are ready
    No "linkerd-proxy" containers found

Ensure you have injected the Linkerd proxy into your application via the linkerd inject command.

For more information on linkerd inject, see Step 5: Install the demo app in our Getting Started guide.

√ data plane proxy metrics are present in Prometheus

Example failure:

× data plane proxy metrics are present in Prometheus
    Data plane metrics not found for linkerd/linkerd-controller-b8c4c48c8-pflc9.

Ensure Prometheus can connect to each linkerd-proxy via the Prometheus dashboard:

kubectl -n linkerd port-forward svc/linkerd-prometheus 9090

…and then browse to http://localhost:9090/targets and validate the linkerd-proxy section.

You should see all your pods here. If you do not:

  • Prometheus might be experiencing connectivity issues with the k8s api server. Check out the logs and delete the pod to flush any possible transient errors.
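
The same target information is available from the Prometheus HTTP API at /api/v1/targets, which can be easier to script than the dashboard. The sketch below counts targets reporting health "down". It assumes the port-forward above is still running and that the response is compact JSON; the function name count_down_targets is hypothetical, and the count covers every scrape job, not only linkerd-proxy.

```shell
# Hypothetical helper: read the JSON from Prometheus' /api/v1/targets
# endpoint on stdin and print how many targets report health "down".
count_down_targets() {
    grep -o '"health":"down"' | wc -l | tr -d ' '
}

# Possible usage while the port-forward is active:
#   curl -s localhost:9090/api/v1/targets | count_down_targets
```

A result of 0 means no scraped target is reporting as down. It does not show whether a pod is missing from the target list entirely.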

√ data plane is up-to-date

Example failure:

‼ data plane is up-to-date
    linkerd/linkerd-prometheus-74d66f86f6-6t6dh: is running version 19.1.2 but the latest edge version is 19.1.3

See the page on Upgrading Linkerd.

√ data plane and cli versions match

Example failure:

‼ data plane and cli versions match
    linkerd/linkerd-web-5f6c45d6d9-9hd9j: is running version 19.1.2 but the latest edge version is 19.1.3

See the page on Upgrading Linkerd.