Circuit breaking is a pattern for improving the reliability of distributed applications. In circuit breaking, an application which makes network calls to remote backends monitors whether those calls succeed or fail, in an attempt to determine whether that backend is in a failed state. If a given backend is believed to be in a failed state, its circuit breaker is “tripped”, and no subsequent requests are sent to that backend until it is determined to have returned to normal.
The Linkerd proxy is capable of performing endpoint-level circuit breaking on HTTP requests using a configurable failure accrual strategy. This means that the Linkerd proxy performs circuit breaking at the level of individual endpoints in a load balancer (i.e., each Pod in a given Service), and failures are tracked at the level of HTTP response status codes.
Circuit breaking is a client-side behavior, and is therefore performed by the
outbound side of the Linkerd proxy.[1] Outbound proxies implement circuit
breaking in the load balancer, by marking failing endpoints as unavailable.
When an endpoint is unavailable, the load balancer will not select it when
determining where to send a given request. This means that if only some
endpoints have tripped their circuit breakers, the proxy will simply not select
those endpoints while they are in a failed state. When all endpoints in a load
balancer are unavailable, requests may be failed with 503 Service Unavailable
errors, or, if the Service is one of multiple backendRefs in an HTTPRoute, the
entire backend Service will be considered unavailable and a different backend
may be selected.
The outbound_http_balancer_endpoints gauge metric reports the number
of “ready” and “pending” endpoints in a load balancer, with the “pending” number
including endpoints made unavailable by failure accrual.
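As a rough illustration of the selection behavior described above, the following Python sketch filters out unavailable endpoints before choosing one. It is not the proxy's implementation: the Endpoint type and the pick_endpoint function are hypothetical names, and uniform random choice is only a stand-in for whatever balancing algorithm the proxy actually uses.

```python
import random
from dataclasses import dataclass


@dataclass
class Endpoint:
    # Hypothetical model of one endpoint (e.g. one Pod) in a load balancer.
    address: str
    available: bool = True  # False while this endpoint's circuit breaker is tripped


def pick_endpoint(endpoints, rng=None):
    """Return an available endpoint, or None if every endpoint is unavailable.

    Assumption: uniform random choice is a placeholder for the real
    balancing algorithm, which this sketch does not model.
    """
    rng = rng or random.Random()
    ready = [e for e in endpoints if e.available]
    if not ready:
        # The caller would fail the request with 503 Service Unavailable,
        # or try a different backend if the route has several backendRefs.
        return None
    return rng.choice(ready)
```

With one of three endpoints tripped, pick_endpoint only ever returns one of the other two. Once all three are tripped it returns None, which corresponds to the 503 or backend-fallback case.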
Failure Accrual Policies
A failure accrual policy determines how failures are tracked for endpoints, and what criteria result in an endpoint becoming unavailable (“tripping the circuit breaker”). Currently, the Linkerd proxy implements one failure accrual policy, consecutive failures. Additional failure accrual policies may be added in the future.
In the consecutive failures policy, an endpoint is marked as failing after a configurable number of failures occur consecutively (i.e., without any successes in between). For example, if the maximum number of failures is 7, the endpoint is made unavailable once 7 failures occur in a row with no successes.
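The counting rule can be sketched in a few lines of Python. This is a minimal model under stated assumptions, not Linkerd's code: the class and method names are hypothetical, and deciding which HTTP status codes count as failures is left to the caller.

```python
class ConsecutiveFailures:
    """Hypothetical model of the consecutive failures policy."""

    def __init__(self, max_failures: int = 7) -> None:
        self.max_failures = max_failures
        self.failures = 0  # length of the current run of failures

    def record(self, success: bool) -> bool:
        """Record one response; return True if the breaker should trip."""
        if success:
            # Any success resets the run, so earlier failures no longer count.
            self.failures = 0
            return False
        self.failures += 1
        return self.failures >= self.max_failures
```

For example, with max_failures=3, the sequence failure, failure, success, failure, failure does not trip the breaker, because the success resets the count. One more failure after that does.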
Probation and Backoffs
Once a failure accrual policy makes an endpoint unavailable, the circuit breaker will attempt to determine whether the endpoint is still in a failing state, and transition it back to available if it has recovered. This process is called probation. When an endpoint enters probation, it is temporarily made available to the load balancer again, and permitted to handle a single request, called a probe request. If this request succeeds, the endpoint is no longer considered failing, and is once again made available. If the probe request fails, the endpoint remains unavailable, and another probe request will be issued after a backoff.
When an endpoint’s failure accrual policy trips the circuit breaker, it will remain unavailable for at least a minimum penalty duration. After this duration has elapsed, the endpoint will enter probation. When a probe request fails, the endpoint will not be placed in probation again until a backoff duration has elapsed. Every time a probe request fails, the backoff increases exponentially, up to an upper bound set by the maximum penalty duration.
An amount of random noise, called jitter, is added to each backoff duration. Jitter is controlled by a parameter called the jitter ratio, a floating-point number from 0.0 to 100.0, which represents the maximum percentage of the original backoff duration which may be added as jitter.
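To make the backoff schedule concrete, here is a small Python sketch of how the wait before each successive probe might grow. Several details are assumptions rather than facts about the proxy: the function name probe_delays is hypothetical, the backoff is assumed to double on each failed probe (the text only says it grows exponentially), the jitter ratio is read literally as a percentage as described above, and jitter is added on top of the capped backoff without being capped itself.

```python
import random


def probe_delays(min_penalty=1.0, max_penalty=60.0, jitter_ratio=0.5,
                 attempts=8, rng=None):
    """Return the wait in seconds before each of `attempts` successive probes.

    Assumptions (not confirmed by the source): doubling per failed probe,
    jitter_ratio interpreted as a percentage of the backoff, and jitter
    applied after the backoff is capped at max_penalty.
    """
    rng = rng or random.Random()
    delays = []
    for attempt in range(attempts):
        # Exponential growth from the minimum penalty, capped at the maximum.
        base = min(min_penalty * 2 ** attempt, max_penalty)
        # Random extra wait of at most jitter_ratio percent of the backoff.
        jitter = base * (jitter_ratio / 100.0) * rng.random()
        delays.append(base + jitter)
    return delays
```

With jitter disabled (jitter_ratio=0.0) and penalties of 1 and 60 seconds, this yields waits of 1, 2, 4, 8, 16, 32, 60, and 60 seconds: the schedule doubles until it reaches the maximum penalty and then stays there.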
Configuring Failure Accrual
HTTP failure accrual is configured by a set of annotations. When these annotations are added to a Kubernetes Service, client proxies will perform HTTP failure accrual when communicating with endpoints of that Service. If no failure accrual annotations are present on a Service, proxies will not perform failure accrual.
Set this annotation on a Service to enable meshed clients to use circuit breaking when sending traffic to that Service:
balancer.linkerd.io/failure-accrual: Selects the failure accrual policy used when communicating with this Service. If this is not present, no failure accrual is performed. Currently, the only supported value for this annotation is "consecutive", which selects the consecutive failures policy.
When the failure accrual mode is "consecutive", the following annotations configure parameters for the consecutive failures policy:
balancer.linkerd.io/failure-accrual-consecutive-max-failures: Sets the number of consecutive failures which must occur before an endpoint is made unavailable. Must be an integer. If this annotation is not present, the default value is 7.
balancer.linkerd.io/failure-accrual-consecutive-min-penalty: Sets the minimum penalty duration for which an endpoint will be marked as unavailable after max-failures consecutive failures occur. After this period of time elapses, the endpoint will be probed. This duration must be non-zero, and may not be greater than the max-penalty duration. If this annotation is not present, the default value is one second.
balancer.linkerd.io/failure-accrual-consecutive-max-penalty: Sets the maximum penalty duration for which an endpoint will be marked as unavailable after max-failures consecutive failures occur. This is an upper bound on the duration between probe requests. This duration must be non-zero, and must be greater than the min-penalty duration. If this annotation is not present, the default value is one minute.
balancer.linkerd.io/failure-accrual-consecutive-jitter-ratio: Sets the jitter ratio used for probation backoffs. This is a floating-point number, and must be between 0.0 and 100.0. If this annotation is not present, the default value is 0.5.
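The constraints listed for these annotations can be summarized as a small validation routine. The Python sketch below is illustrative only and is not how Linkerd validates annotations: the function name is hypothetical, durations are assumed to be already parsed into seconds (the annotation string format is not modeled), and equal min-penalty and max-penalty values are accepted here even though the text above is not explicit about that case.

```python
def check_consecutive_settings(max_failures=7, min_penalty_s=1.0,
                               max_penalty_s=60.0, jitter_ratio=0.5):
    """Return a list of problems with the given settings (empty if none).

    Defaults mirror the documented defaults: 7 failures, one second,
    one minute, and a jitter ratio of 0.5.
    """
    problems = []
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_failures, bool) or not isinstance(max_failures, int):
        problems.append("max-failures must be an integer")
    if min_penalty_s <= 0:
        problems.append("min-penalty must be non-zero")
    if max_penalty_s <= 0:
        problems.append("max-penalty must be non-zero")
    if min_penalty_s > max_penalty_s:
        problems.append("min-penalty may not be greater than max-penalty")
    if not 0.0 <= jitter_ratio <= 100.0:
        problems.append("jitter-ratio must be between 0.0 and 100.0")
    return problems
```

Calling check_consecutive_settings() with no arguments returns an empty list, since the documented defaults satisfy every constraint.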
[1] The part of the proxy which handles connections from within the pod to the rest of the cluster.