Proxy Metrics

The Linkerd proxy exposes metrics that describe the traffic flowing through the proxy. The following metrics are available at /metrics on the proxy’s metrics port (default: :4191) in the Prometheus format.
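As a rough illustration of what a consumer of this endpoint sees, the following Python sketch parses Prometheus text-format samples into (name, labels, value) tuples. The `SAMPLE` text, the helper names `fetch_metrics` and `parse_metrics`, and the `localhost:4191` URL (which assumes the metrics port has been forwarded to your machine) are illustrative assumptions, not part of Linkerd. The parser handles only simple sample lines and does not unescape label values.

```python
import re
import urllib.request
from typing import Dict, Iterator, Tuple

# A sample line is: metric name, optional {labels}, then a value.
_SAMPLE_LINE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
_LABEL_PAIR = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')

# Made-up sample text, used only to exercise the parser.
SAMPLE = """\
# TYPE request_total counter
request_total{direction="outbound",tls="true"} 42
tcp_open_connections{direction="inbound",peer="src"} 3
"""


def fetch_metrics(url: str = "http://localhost:4191/metrics") -> str:
    """Fetch the raw metrics text. The default URL is an assumption."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read().decode("utf-8")


def parse_metrics(text: str) -> Iterator[Tuple[str, Dict[str, str], float]]:
    """Yield (name, labels, value) per sample line, skipping comments."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line, or a HELP/TYPE comment
        match = _SAMPLE_LINE.match(line)
        if match is None:
            continue  # not a sample line this sketch understands
        name, raw_labels, value = match.groups()
        labels = dict(_LABEL_PAIR.findall(raw_labels or ""))
        yield name, labels, float(value)


if __name__ == "__main__":
    for name, labels, value in parse_metrics(SAMPLE):
        print(name, labels, value)
```

In practice, a Prometheus server or an existing client library would normally do this parsing; the sketch is only meant to make the format concrete.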

Process-Level Metrics

proxy_build_info: A constant gauge with information on how this instance of the proxy was built, such as build date, proxy version, etc.
rustls_info: A constant gauge with information on the proxy’s TLS library, rustls. This includes the following labels:
- tls_suites: The set of cipher suites that the proxy will use for TLS connections, sorted by preference order.
- tls_kx_groups: The set of key exchange algorithms the proxy will use, sorted by preference order.
- tls_rand: The secure randomness provider.
- tls_key_provider: The cryptographic key provider.
- tls_fips: A boolean denoting if the proxy’s connections are FIPS-compliant.
tokio_rt_*: A set of counters and gauges with stats on the proxy’s asynchronous runtime, tokio.

Protocol-Level Metrics

request_total: A counter of the number of requests the proxy has received. This is incremented when the request stream begins.
response_total: A counter of the number of responses the proxy has received. This is incremented when the response stream ends.
response_latency_ms: A histogram of response latencies. This measurement reflects the time-to-first-byte (TTFB) by recording the elapsed time between the proxy processing a request’s headers and the first data frame of the response. If a response does not include any data, the end-of-stream event is used. The TTFB measurement is used so that Linkerd accurately reflects application behavior when a server provides response headers immediately but is slow to begin serving the response body.
route_request_total, route_response_latency_ms, and route_response_total: These metrics are analogous to request_total, response_latency_ms, and response_total except that they are collected at the route level. This means that they do not have authority, tls, grpc_status_code or any outbound labels but instead they have:
- dst: The authority of this request.
- rt_route: The name of the route for this request.
control_request_total, control_response_latency_ms, and control_response_total: These metrics are analogous to request_total, response_latency_ms, and response_total but for requests that the proxy makes to the Linkerd control plane. Instead of authority, direction, or any outbound labels, they have:
- addr: The address used to connect to the control plane.
inbound_http_authz_allow_total: A counter of the total number of inbound HTTP requests that were authorized.
- authz_name: The name of the authorization policy used to allow the request.
inbound_http_authz_deny_total: A counter of the total number of inbound HTTP requests that were denied by the authorization policy.
inbound_http_route_not_found_total: A counter of the total number of inbound HTTP requests that could not be associated with a route.

Note that latency measurements are not exported to Prometheus until the stream completes. This is necessary so that latencies can be labeled with the appropriate response classification.
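Histogram metrics such as response_latency_ms are exported in the Prometheus format as cumulative buckets, each carrying an upper bound. A minimal sketch of how a percentile could be estimated from such buckets, using linear interpolation within a bucket, is shown here. The bucket bounds and counts in the usage lines are made up for illustration and are not the proxy's actual bucket layout; `estimate_quantile` is a hypothetical helper name.

```python
from typing import List, Tuple


def estimate_quantile(q: float, buckets: List[Tuple[float, float]]) -> float:
    """Estimate the q-quantile (0 <= q <= 1) from cumulative buckets.

    buckets holds (upper_bound, cumulative_count) pairs and should end
    with an upper bound of +Inf, as in the Prometheus format.
    """
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return float("nan")  # no observations recorded
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if bound == float("inf") or count == prev_count:
                # Cannot interpolate into an open-ended or empty bucket.
                return prev_bound
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


if __name__ == "__main__":
    # Made-up buckets: (upper bound in ms, cumulative count).
    example = [(10.0, 50.0), (100.0, 90.0), (1000.0, 100.0), (float("inf"), 100.0)]
    print(estimate_quantile(0.5, example))
    print(estimate_quantile(0.95, example))
```

The estimate is only as precise as the bucket boundaries allow, so a coarse bucket layout yields a coarse percentile.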

Labels

Each of these metrics has the following labels:

authority: The value of the :authority (HTTP/2) or Host (HTTP/1.1) header of the request.
direction: inbound if the request originated from outside of the pod, outbound if the request originated from inside of the pod.
tls: true if the request’s connection was secured with TLS.

Authority Label

For metrics with the direction=inbound label value, the authority label is omitted. This is a security measure: otherwise, a malicious client could cause Linkerd to create an unbounded number of metrics by sending requests with many different authority values.

If this is not a concern in your environment, it is possible to re-enable the authority label on these metrics by setting an additional env value in Linkerd’s values.yml:

proxy:
  additionalEnv:
    - name: LINKERD2_PROXY_INBOUND_METRICS_AUTHORITY_LABELS
      value: unsafe

Response Labels

The following labels are only applicable on response_* metrics.

status_code: The HTTP status code of the response.

Response Total Labels

In addition to the labels applied to all response_* metrics, the response_total, route_response_total, and control_response_total metrics also have the following labels:

classification: success if the response was successful, or failure if a server error occurred. This classification is based on the gRPC status code if one is present, and on the HTTP status code otherwise.
grpc_status_code: The value of the grpc-status trailer. Only applicable for gRPC responses.

Note

Because response classification may be determined based on the grpc-status trailer (if one is present), a response may not be classified until its body stream completes. Response latency, however, is determined based on time-to-first-byte, so the response_latency_ms metric is recorded as soon as data is received, rather than when the response body ends. Therefore, the values of the classification and grpc_status_code labels are not yet known when the response_latency_ms metric is recorded.
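The classification label makes it possible to derive a success rate from response_total. The following Python sketch assumes the samples have already been parsed into (name, labels, value) tuples; the tuple shape and the `success_rate` helper are assumptions made for this example. Because counters are cumulative, a real dashboard would compute this over the increase within a time window rather than over raw counter values.

```python
from typing import Dict, Iterable, Tuple

# (metric name, labels, value); this shape is an assumption of the sketch.
Sample = Tuple[str, Dict[str, str], float]


def success_rate(samples: Iterable[Sample], metric: str = "response_total") -> float:
    """Return the fraction of responses classified as success."""
    success = 0.0
    total = 0.0
    for name, labels, value in samples:
        if name != metric:
            continue  # ignore unrelated metrics
        total += value
        if labels.get("classification") == "success":
            success += value
    return success / total if total else float("nan")


if __name__ == "__main__":
    # Made-up counter values for illustration.
    example = [
        ("response_total", {"classification": "success", "status_code": "200"}, 90.0),
        ("response_total", {"classification": "failure", "status_code": "503"}, 10.0),
        ("request_total", {"direction": "inbound"}, 100.0),
    ]
    print(success_rate(example))
```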

Outbound labels

The following labels are only applicable if direction=outbound.

dst_deployment: The deployment to which this request is being sent.
dst_k8s_job: The job to which this request is being sent.
dst_replicaset: The replica set to which this request is being sent.
dst_daemonset: The daemon set to which this request is being sent.
dst_statefulset: The stateful set to which this request is being sent.
dst_replicationcontroller: The replication controller to which this request is being sent.
dst_namespace: The namespace to which this request is being sent.
dst_service: The service to which this request is being sent.
dst_pod_template_hash: The pod-template-hash of the pod to which this request is being sent. This label selector roughly approximates a pod’s ReplicaSet or ReplicationController.

Prometheus Collector labels

The following labels are added by the Prometheus collector.

instance: ip:port of the pod.
job: The Prometheus job responsible for the collection, typically linkerd-proxy.

Kubernetes labels added at collection time

Kubernetes namespace, pod name, and all labels are mapped to corresponding Prometheus labels.

namespace: Kubernetes namespace that the pod belongs to.
pod: Kubernetes pod name.
pod_template_hash: Corresponds to the pod-template-hash Kubernetes label. This value changes during redeploys and rolling restarts. This label selector roughly approximates a pod’s ReplicaSet or ReplicationController.

Linkerd labels added at collection time

Kubernetes labels prefixed with linkerd.io/ are added to your application at linkerd inject time. More specifically, Kubernetes labels prefixed with linkerd.io/proxy-* will correspond to these Prometheus labels:

daemonset: The daemon set that the pod belongs to (if applicable).
deployment: The deployment that the pod belongs to (if applicable).
k8s_job: The job that the pod belongs to (if applicable).
replicaset: The replica set that the pod belongs to (if applicable).
replicationcontroller: The replication controller that the pod belongs to (if applicable).
statefulset: The stateful set that the pod belongs to (if applicable).

Example

Here’s a concrete example, given the following pod snippet:

name: vote-bot-5b7f5657f6-xbjjw
namespace: emojivoto
labels:
  app: vote-bot
  linkerd.io/control-plane-ns: linkerd
  linkerd.io/proxy-deployment: vote-bot
  pod-template-hash: "3957278789"
  test: vote-bot-test

The resulting Prometheus labels will look like this:

request_total{
  pod="vote-bot-5b7f5657f6-xbjjw",
  namespace="emojivoto",
  app="vote-bot",
  control_plane_ns="linkerd",
  deployment="vote-bot",
  pod_template_hash="3957278789",
  test="vote-bot-test",
  instance="10.1.3.93:4191",
  job="linkerd-proxy"
}
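The mapping in this example can be approximated with a few lines of Python. This is a sketch of the naming pattern visible above (drop the linkerd.io/ prefix, drop a leading proxy-, replace characters that are not valid in Prometheus label names with underscores), not the actual relabeling configuration used by the collector. The `prometheus_labels` helper is hypothetical, and the instance and job labels are left out because the collector adds them separately.

```python
import re
from typing import Dict


def prometheus_labels(pod: str, namespace: str, k8s_labels: Dict[str, str]) -> Dict[str, str]:
    """Approximate the Kubernetes-to-Prometheus label mapping shown above."""
    result = {"pod": pod, "namespace": namespace}
    for key, value in k8s_labels.items():
        if key.startswith("linkerd.io/"):
            # linkerd.io/control-plane-ns -> control-plane-ns
            key = key[len("linkerd.io/"):]
            if key.startswith("proxy-"):
                # proxy-deployment -> deployment
                key = key[len("proxy-"):]
        # Prometheus label names may only contain [a-zA-Z0-9_].
        result[re.sub(r"[^a-zA-Z0-9_]", "_", key)] = value
    return result


if __name__ == "__main__":
    labels = prometheus_labels(
        "vote-bot-5b7f5657f6-xbjjw",
        "emojivoto",
        {
            "app": "vote-bot",
            "linkerd.io/control-plane-ns": "linkerd",
            "linkerd.io/proxy-deployment": "vote-bot",
            "pod-template-hash": "3957278789",
            "test": "vote-bot-test",
        },
    )
    for name, value in labels.items():
        print(f'{name}="{value}"')
```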

Transport-Level Metrics

The following metrics are collected at the level of the underlying transport layer.

tcp_open_total: A counter of the total number of opened transport connections.
tcp_close_total: A counter of the total number of transport connections which have closed.
tcp_open_connections: A gauge of the number of transport connections currently open.
tcp_write_bytes_total: A counter of the total number of sent bytes. This is updated when the connection closes.
tcp_read_bytes_total: A counter of the total number of received bytes. This is updated when the connection closes.
inbound_tcp_errors_total: A counter of the total number of inbound TCP connections that could not be processed due to a proxy error.
outbound_tcp_errors_total: A counter of the total number of outbound TCP connections that could not be processed due to a proxy error.
inbound_tcp_authz_allow_total: A counter of the total number of TCP connections that were authorized.
inbound_tcp_authz_deny_total: A counter of the total number of TCP connections that were denied.

Labels

Each of these metrics has the following labels:

direction: inbound if the connection was established either from outside the pod to the proxy, or from the proxy to the application; outbound if the connection was established either from the application to the proxy, or from the proxy to outside the pod.
peer: src if the connection was accepted by the proxy from the source; dst if the connection was opened by the proxy to the destination.

Note that the labels described above under the heading “Prometheus Collector labels” are also added to transport-level metrics, when applicable.

Connection Close Labels

The following labels are added only to metrics which are updated when a connection closes (tcp_close_total):

classification: success if the connection terminated cleanly, failure if the connection closed due to a connection failure.

Identity Metrics

identity_cert_expiration_timestamp_seconds: A gauge of the time when the proxy’s current mTLS identity certificate will expire (in seconds since the UNIX epoch).
identity_cert_refresh_count: A counter of the total number of times the proxy’s mTLS identity certificate has been refreshed by the Identity service.
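Because identity_cert_expiration_timestamp_seconds is expressed in seconds since the UNIX epoch, the remaining certificate lifetime is simply the gauge value minus the current time. A small Python sketch follows; the helper names and the one-hour threshold are arbitrary choices made for this example, not Linkerd defaults.

```python
import time
from typing import Optional


def seconds_until_expiry(expiration_timestamp: float, now: Optional[float] = None) -> float:
    """Return the seconds remaining before the certificate expires.

    A negative result means the certificate has already expired.
    """
    if now is None:
        now = time.time()  # current time in seconds since the UNIX epoch
    return expiration_timestamp - now


def expires_soon(expiration_timestamp: float,
                 threshold_seconds: float = 3600.0,
                 now: Optional[float] = None) -> bool:
    """True if the certificate expires within threshold_seconds (hypothetical threshold)."""
    return seconds_until_expiry(expiration_timestamp, now) < threshold_seconds


if __name__ == "__main__":
    # Made-up gauge value: a certificate that expires two hours from now.
    gauge_value = time.time() + 7200
    print(round(seconds_until_expiry(gauge_value)))
    print(expires_soon(gauge_value))
```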

Endpoint Metrics

When performing policy-based routing, proxies may dispatch requests through per-route backend configurations. See the Authorization Policy overview and reference pages for more information on how to configure policy-based routing.

The Linkerd proxy emits metrics that provide visibility into authorized HTTP and gRPC traffic. Route-level metrics measure traffic for all of a policy’s associated backends, while backend-level metrics measure the traffic distributed to individual endpoints.

The outbound proxy records the following metrics:

outbound_http_route_request_duration_seconds: A histogram measuring the time between HTTP request initialization and HTTP response completion.
outbound_http_route_request_statuses_total: A counter tracking HTTP response status codes for HTTP traffic sent to a route.
outbound_http_route_request_frame_size_bytes: A histogram measuring the sizes of DATA frames in HTTP response bodies for a route.
outbound_grpc_route_request_duration_seconds: A histogram measuring the time between gRPC request initialization and gRPC response completion.
outbound_grpc_route_request_statuses_total: A counter tracking gRPC response status codes for gRPC traffic sent to a GRPCRoute.
outbound_grpc_route_request_frame_size_bytes: A histogram measuring the sizes of DATA frames in gRPC response bodies for a route.
outbound_http_route_backend_requests_total: A counter tracking the total number of outbound HTTP requests dispatched to a particular backend.
outbound_http_route_backend_response_duration_seconds: A histogram measuring the time in seconds between the HTTP request completing and HTTP response completing, for a particular backend.
outbound_http_route_backend_response_statuses_total: A counter tracking HTTP responses from a particular backend, labeled by status code.
outbound_http_route_backend_response_frame_size_bytes: A histogram measuring the sizes of DATA frames in HTTP response bodies from a particular backend.
outbound_grpc_route_backend_requests_total: A counter tracking the total number of outbound gRPC requests dispatched to a particular backend.
outbound_grpc_route_backend_response_duration_seconds: A histogram measuring the time in seconds between the gRPC request completing and gRPC response completing, for traffic dispatched to a particular backend.
outbound_grpc_route_backend_response_statuses_total: A counter tracking gRPC responses from a particular backend, labeled by the grpc-status code.
outbound_grpc_route_backend_response_frame_size_bytes: A histogram measuring the sizes of DATA frames in gRPC response bodies from a particular backend.

The inbound proxy records the following metrics:

inbound_http_requests_total: A counter tracking the total number of inbound HTTP requests received by a particular backend.
inbound_grpc_requests_total: A counter tracking the total number of inbound gRPC requests received by a particular backend.
inbound_http_statuses_total: A counter tracking HTTP response status codes for HTTP traffic received by a particular backend.
inbound_grpc_statuses_total: A counter tracking gRPC response status codes for gRPC traffic received by a particular backend.
inbound_http_request_duration_seconds: A histogram measuring the time between HTTP request initialization and HTTP response completion.
inbound_http_response_duration_seconds: A histogram measuring the time in seconds between the HTTP request completing and HTTP response completing, for a particular backend.
inbound_grpc_request_duration_seconds: A histogram measuring the time between gRPC request initialization and gRPC response completion.
inbound_grpc_response_duration_seconds: A histogram measuring the time in seconds between the gRPC request completing and gRPC response completing, for a particular backend.
inbound_http_request_frame_size_bytes: A histogram measuring the sizes of DATA frames in HTTP request bodies for a particular route.
inbound_http_response_frame_size_bytes: A histogram measuring the sizes of DATA frames in HTTP response bodies for a particular route.
inbound_grpc_request_frame_size_bytes: A histogram measuring the sizes of DATA frames in gRPC request bodies for a particular route.
inbound_grpc_response_frame_size_bytes: A histogram measuring the sizes of DATA frames in gRPC response bodies for a particular route.

Labels

Each of these metrics has the following common labels, which describe the Kubernetes resources to which traffic is routed by the proxy:

parent_group, parent_kind, parent_name, and parent_namespace reference the parent resource through which the proxy discovered the route binding. The parent resource of an HTTPRoute is generally a Service.
route_group, route_kind, route_name, and route_namespace reference the route resource through which the proxy discovered the route binding. This will either reference an HTTPRoute resource or a default (synthesized) route.
backend_group, backend_kind, backend_name, and backend_namespace reference the backend resource to which the proxy routed the request. This will always be a Service.

In addition, the outbound_http_balancer_endpoints gauge metric adds the following labels:

endpoint_state: Either “ready” if the endpoint is available to have requests routed to it by the load balancer, or “pending” if the endpoint is currently unavailable.
Endpoints may be “pending” when a connection is being established (or reestablished), or when the endpoint has been made unavailable by failure accrual.