Linkerd 2.14 introduces several new features, including flat-network multicluster, workload identity across clusters, Gateway API conformance, timeout support in HTTPRoute, and (of course) many bugfixes. Let’s take a closer look at some of the highlights.
In Linkerd 2.13 and earlier, Linkerd’s multicluster functionality worked by channeling traffic through multicluster gateways.
This mechanism is simple and functions very well as long as the gateways have IP connectivity, but it has two significant caveats:
Added Cost: Funnelling traffic through the gateways adds latency (though not very much in Linkerd’s case). Additionally, the gateways themselves often require network load balancers, which can add expense.
Lost Identity: When a workload in one cluster calls a workload in a different cluster, the workload in the second cluster sees the request originating from its gateway, not from the workload in the first cluster. This makes it impossible to properly implement authorization policy across clusters.
The solution in 2.14? Completely bypass the gateways.
This new approach brings some real benefits to the table, all stemming from the fact that now workload 1 is simply making a normal mTLS connection directly to workload 2. This means that workload identity is preserved across cluster boundaries, since the request to workload 2 comes directly from workload 1. In turn, this means that authorization policy just works across cluster boundaries, as does all the rest of Linkerd’s functionality related to mTLS and identity.
Additionally, this direct pod-to-pod communication reduces latency and also means that there are fewer components to fail in the network path. There’s also no need for load balancers, which can directly cut costs.
To use this new capability, you start by installing the Linkerd multicluster extension with gateways disabled:
linkerd multicluster install --gateway=false | kubectl apply -f -
and then also add
--gateway=false when linking clusters, e.g.:
linkerd --context=us-west multicluster link \ --cluster-name us-west \ --gateway=false \ | kubectl --context=us-east apply -f -
Note that even though the gateways are no longer in play, you’ll still see the Link resources.
Finally, when you label Services for export across clusters, you’ll use
remote-discovery value for this label triggers Linkerd to not use
gateways, and instead directly communicate between the two clusters’ control
planes to handle Service mirroring across clusters.
Of course, there are still points to be aware of. The most critical is also the most obvious: you must have a flat network in which each cluster has its own unique CIDR range, and Pods in one cluster can talk directly to Pods in other clusters. How tricky this is to arrange will depend on your cluster provider, but it’s always possible!
Additionally - as always with multicluster operations in Linkerd - your clusters will all need a shared trust anchor. We generally recommend that each cluster have a distinct identity trust domain (and thus cluster domain), as this can considerably simplify cross-cluster authorization policy.
Gateway API Improvements
Starting in Linkerd 2.12, the Gateway API became the central configuration mechanism for talking about classes of HTTP traffic (including gRPC), including functionality such as authentication policy (Linkerd 2.12) and dynamic request routing (Linkerd 2.13). In the long run, Gateway API is expected to replace SMI and ServicePolicy, though there’s no defined timetable for that yet.
A challenge when implementing Gateway API as a service mesh is that the conformance tests for Gateway API were originally defined assuming that any implementation being tested for conformance included a gateway controller that handled north/south traffic. Since Linkerd relies on external ingress controllers for north/south traffic rather than bundling its own, there was no way that Linkerd could be conformant.
This situation changes in Gateway API v0.8.0, just released on August 29,
2023, which adds the concept of conformance profiles to Gateway API. These
named profiles define subsets of Gateway API and allow implementations to
choose (and document) the subsets to which they conform. One of the profiles
defined by v0.8.0 is the
Mesh profile, which checks only service mesh
functionality, and we’re delighted to announce that Linkerd 2.14 is fully
conformant with this new
A major component of Gateway API conformance is, of course, the APIGroup used
for Gateway API resources. Since Linkerd 2.13 and earlier could not be
conformant, by definition, they copied the HTTPRoute resource into the
policy.linkerd.io APIGroup. Linkerd 2.14 adds support for the official
Gateway API group,
gateway.networking.k8s.io. New installations should
gateway.networking.k8s.io HTTPRoutes (with one notable
exception, described below), but of course support for
will remain for the foreseeable future.
One more thing that Linkerd 2.14 adds is support for timeouts in HTTPRoutes:
apiVersion: policy.linkerd.io/v1beta3 kind: HTTPRoute metadata: name: timeout-example spec: ... rules: - backendRefs: - name: some-service port: 8080 timeouts: request: 10s backendRequest: 3s
The two types of timeouts have different purposes:
timeouts.requestsets the end-to-end timeout for the entire request, including retries, etc., and
timeouts.backendRequestsets the timeout for just the communications from the proxies to the workload.
Either timeout can be set alone, or both can be set together (in which case
timeouts.backendRequest must be less than
timeouts.request). To disable a
timeout, set it to
0s – not setting a timeout leaves Linkerd to enforce
Both timeouts use the syntax of Go’s
time.ParseDuration, with the exception
that floating-point numbers are not allowed (so
1h30m is legal, but
is not). The gory details are specified in GEP-2257 and GEP-1742, but it’s
worth noting in particular that the zero value for “no timeout” must be
0s, not simply
There’s one final important point about timeouts: Linkerd 2.14 only supports
policy.linkerd.io/v1beta3 HTTPRoutes, because GEP-1742 didn’t
quite make it into Gateway API v0.8.0 (it’s slated for inclusion in Gateway
API v1.0.0). This is the one reason you might still need to use
policy.linkerd.io HTTPRoutes instead of switching to
Linkerd has a long tradition of easily providing security, reliability, and observability, while maintaining operational simplicity and performance. Linkerd 2.14 continues this tradition, adding important new capabilities like flat-network multicluster and enhanced Gateway API functionality in order to address the needs of users from small startups to huge enterprises.
If you found this interesting, check out the Service Mesh Academy workshop on
What’s Coming in Linkerd
where you can see the hands-on demo of everything I’ve talked about here! And,
as always, feedback is always welcome – you can find me as
@flynn on the