
Several Service Mesh frameworks are in common use in the industry today. This article describes and compares them in detail.

1. Linkerd

Linkerd is a high-performance open source network proxy from Buoyant and the industry’s first Service Mesh framework, launched in 2016. It addresses the problems of communication between services in a distributed environment, such as unreliable and insecure networks, latency, and packet loss.

Linkerd is written in Scala, runs on the JVM, and builds on and extends Twitter’s Finagle library. Above all, Linkerd is fast and lightweight: it handles tens of thousands of requests per second with minimal added latency and overhead, and it scales horizontally with ease. It also offers the following features:

  • Multi-platform support: runs on multiple platforms, for example Kubernetes, DC/OS, Docker, and even virtual machines or physical machines.
  • Seamless integration with multiple service discovery tools.
  • Support for multiple protocols, such as gRPC, HTTP/1.x, and HTTP/2, and even TCP through linkerd-tcp.
  • Integration with the third-party distributed tracing system Zipkin.
  • High flexibility and extensibility: custom plug-ins can be developed through its interfaces.

Linkerd and Linkerd2 are currently developed in parallel:

  • Linkerd: written in Scala, runs on the JVM, and builds on and extends Twitter’s Finagle library.
  • Linkerd2: a complete rewrite of Linkerd in Go and Rust, designed exclusively for Kubernetes.

Linkerd itself is the data plane: it routes requests to target services while keeping data transfer secure, reliable, and fast in a distributed environment. Linkerd also includes a control plane component, Namerd, which provides centralized management and storage of routing rules, centralized management of service discovery configuration, dynamic routing at run time, and a management interface exposed through the Namerd API.
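
Linkerd 1.x expresses its routing rules as delegation tables (dtabs), lists of prefix rewrites that turn a logical service name into a concrete one. The following is a deliberately simplified sketch of that idea and not Linkerd’s actual resolver: it ignores wildcards, alternatives, and fallback on failed lookups, and the `delegate` function, the service names, and the example paths are assumptions made for illustration.

```python
from typing import List, Tuple

# A rule rewrites a name prefix; rules later in the table take precedence.
Dtab = List[Tuple[str, str]]


def delegate(name: str, dtab: Dtab, max_depth: int = 10) -> str:
    """Rewrite `name` by repeatedly applying the last matching prefix rule."""
    for _ in range(max_depth):
        for prefix, replacement in reversed(dtab):
            if name == prefix or name.startswith(prefix + "/"):
                name = replacement + name[len(prefix):]
                break
        else:
            return name  # no rule matched, so the name is fully rewritten
    raise RuntimeError("delegation did not terminate")


# Hypothetical base table: logical names go to a "prod" environment,
# which in turn maps to an illustrative Kubernetes namer path.
base: Dtab = [
    ("/svc", "/env/prod"),
    ("/env/prod", "/#/io.l5d.k8s/default/http"),
]

# A run-time override that sends one service to a different version.
override: Dtab = base + [("/svc/users", "/svc/users-v2")]

print(delegate("/svc/users", base))
print(delegate("/svc/users", override))
```

Because the override is just one more row in the table, routing can change at run time without touching the services themselves, which is the kind of rule a central store such as Namerd would hold.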

Figure 3.2.1: Linkerd architecture diagram

  • Control plane

    In Linkerd2, the control plane is a set of services that run in a dedicated Kubernetes namespace. These services do a variety of things: aggregate telemetry data, provide user-facing APIs, provide control data to the data plane proxies, and so on.

    It consists of the following parts:

    • Controller: consists of the public-api container, which provides the API used by the CLI and the dashboard.
    • Destination: each proxy in the data plane uses this component to look up where to send requests. It also serves service configuration such as route metrics, retries, and timeouts.
    • Identity: issues certificates. It accepts certificate signing requests (CSRs) from proxies and returns certificates signed with the correct identity. Proxies obtain these certificates at startup, and they must be issued before a proxy is ready. They are then used on every connection between Linkerd proxies to implement mTLS.
    • Proxy Injector: an injector that receives a webhook request each time a pod is created. It checks the resource for the Linkerd-specific annotation (linkerd.io/inject: enabled). When the annotation is present, the injector modifies the pod specification and adds an initContainer together with a sidecar containing the proxy itself.
    • Service Profile Validator: validates a new service profile before it is saved.
    • Tap: receives requests from the CLI and the dashboard to watch requests and responses in real time.
  • Data plane

    Consists of lightweight proxies deployed as sidecar containers alongside each instance of the service code. To add a service to the Linkerd service mesh, the Pods for that service must be redeployed so that each Pod includes the data plane proxy.
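
The injection decision described for the Proxy Injector can be sketched in a few lines. This is a minimal illustration over a pod manifest held as a plain dictionary, not Linkerd’s implementation: the `maybe_inject` function and the `proxy-init` and `proxy` container names and images are hypothetical, and only the `linkerd.io/inject: enabled` annotation comes from the text above.

```python
import copy
from typing import Any, Dict

INJECT_ANNOTATION = "linkerd.io/inject"


def maybe_inject(pod: Dict[str, Any]) -> Dict[str, Any]:
    """Return `pod` unchanged, or a patched copy if injection is requested."""
    annotations = pod.get("metadata", {}).get("annotations") or {}
    if annotations.get(INJECT_ANNOTATION) != "enabled":
        return pod  # annotation missing or not "enabled": leave the pod alone

    patched = copy.deepcopy(pod)  # never mutate the caller's manifest
    spec = patched.setdefault("spec", {})
    # Hypothetical init container that would prepare traffic redirection.
    spec.setdefault("initContainers", []).append(
        {"name": "proxy-init", "image": "example/proxy-init"}
    )
    # Hypothetical sidecar container that runs the data plane proxy.
    spec.setdefault("containers", []).append(
        {"name": "proxy", "image": "example/proxy"}
    )
    return patched


pod = {
    "metadata": {"annotations": {"linkerd.io/inject": "enabled"}},
    "spec": {"containers": [{"name": "app", "image": "example/app"}]},
}
patched = maybe_inject(pod)
print([c["name"] for c in patched["spec"]["containers"]])
```

Working on a copy keeps the sketch free of side effects; a real admission webhook would instead return a patch describing the same change.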

2. Envoy

Like Linkerd, Envoy is a high-performance network proxy. Lyft open-sourced it in September 2016. It is designed for cloud-native applications, where it can act as an edge proxy that handles external traffic and as an internal proxy that provides reliable communication between services. Envoy’s implementation draws on the experience of existing production-grade proxies and load balancers such as Nginx, HAProxy, hardware load balancers, and cloud load balancers. Envoy is written in C++ and is used in production at Lyft; it performs well and is stable.

Envoy can run either as a standalone proxy layer or as the data plane of a Service Mesh architecture, in which case it typically runs alongside each service and abstracts networking away from the application. Envoy provides common networking capabilities independent of platform and language. It also offers the following features:

  • First-class support for HTTP/2 and gRPC, along with WebSocket and TCP proxying.
  • API-driven configuration management that supports dynamic management, configuration updates, and hot restart without dropping connections or requests.
  • L3/L4 filters form the core of Envoy’s connection management.
  • Integration with a variety of metrics collection tools and distributed tracing systems provides runtime metrics, distributed tracing, and runtime visibility into the whole system and its services.
  • Low memory footprint. Sidecar is Envoy’s most common deployment pattern.

3. Istio

Istio is an open source Service Mesh framework sponsored by Google, IBM, and Lyft. The project was launched in 2017 and version 1.0 was released in July 2018.

Istio is a typical implementation of a Service Mesh. If the sidecar is the data plane of the Service Mesh, Istio’s main improvements are on the control plane. Istio uses Envoy as its sidecar, and its control plane components are all written in Go, which brings a significant improvement in performance.

Istio is first of all a service mesh, but it is more than that: on top of a typical service mesh such as Linkerd or Envoy, Istio provides a complete solution that offers behavioral insight and operational control over the entire mesh to meet the diverse needs of microservice applications.

Istio provides a number of key capabilities uniformly across the service mesh:

  • Traffic management: controls the flow of traffic and API calls between services, making calls more reliable and the network more robust under adverse conditions.

  • Observability: an understanding of the dependencies between services and of the nature and direction of the traffic between them makes it possible to identify problems quickly.

  • Policy enforcement: applies organizational policies to interactions between services to ensure that access policies are enforced and resources are fairly distributed among consumers. Policy changes are made by configuring the mesh rather than by modifying application code.

  • Service identity and security: provides services in the mesh with verifiable identities and the ability to protect service traffic as it flows across networks with different levels of trust.

In addition, Istio is designed for extensibility to meet different deployment needs:

  • Platform support: Istio is designed to run in a variety of environments, including multi-cloud, on-premises, Kubernetes, Mesos, and more. It initially focused on Kubernetes, with support for other environments planned.

  • Integration and customization: the policy enforcement component can be extended and customized to integrate with existing solutions for ACLs, logging, monitoring, quotas, auditing, and more.

These capabilities greatly reduce the coupling between application code, underlying platforms, and policies, making microservices easier to implement.
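
One common use of traffic management is to shift traffic between two versions of a service in small, controlled increments. The sketch below shows only the underlying idea of weighted selection; it is not Istio’s API, and the `pick_version` function, the version names, and the 90/10 split are assumptions chosen for illustration.

```python
import random
from typing import Dict, Optional


def pick_version(weights: Dict[str, int],
                 rng: Optional[random.Random] = None) -> str:
    """Choose a service version with probability proportional to its weight."""
    chooser = rng if rng is not None else random.Random()
    versions = list(weights)
    return chooser.choices(
        versions, weights=[weights[v] for v in versions], k=1
    )[0]


# Hypothetical canary rollout: roughly 10% of requests go to v2.
split = {"v1": 90, "v2": 10}
rng = random.Random(0)  # seeded only to make the demo repeatable
picks = [pick_version(split, rng) for _ in range(10_000)]
print(f"share routed to v2: {picks.count('v2') / len(picks):.3f}")
```

Raising the weight of the new version step by step moves traffic over gradually, and setting it back to zero rolls the change back without redeploying anything.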

Figure 3.2.2: Istio architecture diagram

The functions of each sub-module in the Istio architecture diagram are as follows:

  • Envoy: handles communication between application services.

  • Pilot: manages and configures the Envoy proxies, providing service discovery, load balancing, and intelligent routing, and ensuring resilience (service timeouts, retries, and circuit-breaking policies).

  • Mixer: performs policy checks and collects monitoring data.

  • Istio-auth: provides authentication between services and between users and services, implements access control, and answers the question of who may access which API.

The communication proxy shown in the figure is Envoy, which Istio supports natively, but Linkerd can also be integrated with Istio.
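
The circuit-breaking policies mentioned for Pilot are typically expressed as limits, for example a maximum number of concurrent requests, beyond which the proxy rejects work immediately instead of queuing it. The following is a minimal single-threaded sketch of that behavior under stated assumptions; the `ThresholdBreaker` and `Overloaded` names are hypothetical and do not correspond to any Istio or Envoy API.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class Overloaded(Exception):
    """Raised when a request is rejected because the limit is reached."""


class ThresholdBreaker:
    """Fail fast once `max_requests` calls are already in flight."""

    def __init__(self, max_requests: int) -> None:
        self.max_requests = max_requests
        self.in_flight = 0

    def call(self, fn: Callable[[], T]) -> T:
        if self.in_flight >= self.max_requests:
            raise Overloaded("too many requests in flight")
        self.in_flight += 1
        try:
            return fn()
        finally:
            self.in_flight -= 1  # always release the slot, even on error


breaker = ThresholdBreaker(max_requests=1)


def outer() -> str:
    # While this call holds the only slot, a nested call is rejected.
    try:
        breaker.call(lambda: "inner")
    except Overloaded:
        return "rejected"
    return "accepted"


print(breaker.call(outer))
```

A proxy would keep one such counter per upstream service and would need locking or atomic counters under real concurrency; the sketch leaves that out to keep the limit check visible.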

4. Conduit

Conduit was released in December 2017 as Buoyant’s second open source project after Linkerd, in effect a standalone version of Linkerd built for Kubernetes. Conduit aims to drastically reduce the complexity of using a service mesh on Kubernetes and to improve the user experience, rather than being optimized for a variety of platforms as Linkerd is.

The main goals of Conduit are to be lightweight, fast, secure, and very easy to understand and use. Like Linkerd and Istio, Conduit consists of a data plane and a control plane. The data plane is written in Rust, which keeps Conduit’s memory usage very low, and the control plane is written in Go. Conduit supports the functionality expected of a Service Mesh and also offers the following features:

  • Ultra-lightweight and extremely fast.
  • Focused on Kubernetes, improving the reliability, visibility, and security of services running on the platform.
  • Support for gRPC, HTTP/2, and HTTP/1.x requests as well as all TCP traffic.

Conduit has a minimalist architecture built around the concept of zero configuration: it is designed to minimize the interaction users need with Conduit and to work out of the box.

5. Comparison and summary

The following table briefly compares and summarizes the Service Mesh frameworks described above:

| Feature | Linkerd | Envoy | Istio | Conduit |
| --- | --- | --- | --- | --- |
| Proxy | Finagle + Jetty | Envoy | Envoy | Conduit |
| Circuit breaking | Supported. Connection-based circuit breaking (Fast Fail) and request-based circuit breaking (Failure Accrual). | Supported, by setting criteria such as maximum connections, maximum requests, maximum pending requests, or maximum retries. | Supported, by setting criteria such as maximum connections and maximum requests. | Not yet supported. |
| Dynamic routing | Supported. Requests are routed dynamically to different service versions by setting Linkerd’s dtab rules. | Supported, based on the service’s version or environment information. | Supported, based on the service’s version or environment information. | Not yet supported. |
| Traffic splitting | Supported. Traffic is split in an incremental, controlled manner. | Supported. Traffic is split in an incremental, controlled manner. | Supported. Traffic is split in an incremental, controlled manner. | Not yet supported. |
| Service discovery | Supported. Works with multiple service discovery mechanisms, such as file-based service discovery, Consul, Zookeeper, and Kubernetes. | Supported. Integrates with different service discovery tools through platform-independent service discovery interfaces. | Supported. Integrates with different service discovery tools through platform-independent service discovery interfaces. | Kubernetes only. |
| Load balancing | Supported. Provides multiple load balancing algorithms. | Supported. Provides a variety of load balancing algorithms, such as round robin, weighted least request, ring hash, and Maglev. | Supported. Provides a variety of load balancing algorithms, such as round robin, weighted least request, ring hash, and Maglev. | Supported. Currently only HTTP requests are load balanced, using a P2C + least-loaded algorithm. |
| Secure communication | TLS supported. | TLS supported. | TLS supported. | TLS supported. |
| Access control | Not supported. | Not supported. | Supported. RBAC-based access control. | Not yet supported. |
| Visibility | Distributed tracing (Zipkin), runtime metrics (InfluxDB, Prometheus, StatsD) | Distributed tracing (Zipkin), runtime metrics (StatsD) | Distributed tracing (Zipkin), runtime metrics (Prometheus, StatsD), monitoring (New Relic, Stackdriver) | Runtime metrics (Prometheus) |
| Deployment mode | Sidecar or per-host | Sidecar | Sidecar | Sidecar |
| Control plane | Namerd | None built in, but one can be implemented through its API. | Pilot, Mixer, Citadel | Conduit |
| Protocol support | HTTP/1.x, HTTP/2, gRPC | HTTP/1.x, HTTP/2, gRPC, TCP | HTTP/1.x, HTTP/2, gRPC, TCP | HTTP/1.x, HTTP/2, gRPC, TCP |
| Runtime platform | Platform independent | Platform independent | Kubernetes currently; platform independence is the ultimate goal. | Kubernetes only. |
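
The load balancing entry for Conduit mentions P2C + least-loaded, that is, "power of two choices": pick two endpoints at random and send the request to the one with less outstanding work. The sketch below illustrates the technique in general terms and is not Conduit’s implementation; the `pick_p2c` function, the endpoint names, and the load figures are assumptions.

```python
import random
from typing import Dict, Optional


def pick_p2c(load: Dict[str, int],
             rng: Optional[random.Random] = None) -> str:
    """Sample two distinct endpoints and return the less loaded one."""
    if not load:
        raise ValueError("no endpoints available")
    chooser = rng if rng is not None else random.Random()
    endpoints = list(load)
    if len(endpoints) == 1:
        return endpoints[0]
    first, second = chooser.sample(endpoints, 2)
    return first if load[first] <= load[second] else second


# Hypothetical in-flight request counts per endpoint.
in_flight = {"10.0.0.1": 1, "10.0.0.2": 5, "10.0.0.3": 9}
rng = random.Random(0)  # seeded only to make the demo repeatable
picks = [pick_p2c(in_flight, rng) for _ in range(1_000)]
print({endpoint: picks.count(endpoint) for endpoint in in_flight})
```

Comparing only two candidates keeps each decision cheap while still steering traffic away from the busiest endpoints, which in this example is never chosen because it loses every comparison.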

Any of the Service Mesh frameworks above should meet your basic needs. Istio offers by far the most functionality and flexibility of these frameworks, but flexibility means complexity, so it requires more preparation from the team. If you only need basic Service Mesh governance, Linkerd is probably the best choice, and because it is platform independent it also suits a heterogeneous environment with both Kubernetes and VMs when you do not need the complexity of Istio, which currently supports such environments as well. Conduit, which supports only Kubernetes, may be your best choice if you want a lightweight mesh on that platform that works out of the box.