In this article, we will focus on the basic concepts of Consul and the internals of its implementation, as well as a comparison with Eureka.

1. What is Consul?

Consul is a service mesh solution that provides a full-featured control plane with service discovery, configuration, and segmentation functionality. Each of these features can be used individually or together as needed to build a full service mesh. Consul requires a data plane and supports both a proxy and a native integration model. Consul ships with a simple built-in proxy so that everything works out of the box, but it also supports third-party proxy integrations such as Envoy. The key features Consul provides:

  • Service discovery: Clients of Consul can register a service, such as api or mysql, and other clients can use Consul to discover providers of a given service. Using DNS or HTTP, applications can easily find the services they depend on (see the sketch after this list).
  • Health checking: Consul clients can provide any number of health checks, either associated with a given service ("is the web server returning 200 OK?") or with the local node ("is memory utilization below 90%?"). Operators can use this information to monitor cluster health, and service discovery components can use it to route traffic away from unhealthy hosts.
  • KV store: Applications can use Consul's hierarchical key/value store for any purpose, including dynamic configuration, feature flagging, coordination, leader election, and more. The simple HTTP API makes it easy to use.
  • Secure service communication: Consul can generate and distribute TLS certificates for services to establish mutual TLS connections. Intentions can be used to define which services are allowed to communicate. Instead of complex network topologies and static firewall rules, service segmentation can be easily managed with intentions that can be changed in real time.
  • Multi-data center: Consul supports multiple data centers out of the box. This means Consul users do not have to worry about building additional layers of abstraction to scale to multiple regions.
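
To make the first three features above more concrete, here is a minimal sketch using Consul's official Go client (github.com/hashicorp/consul/api) against a local agent on the default port. The service name "web", its address, the health-check URL, and the KV key are made-up placeholder values, not anything prescribed by Consul.

```go
// Minimal sketch: register a service with an HTTP health check, discover its
// healthy instances, and read/write a key in the KV store.
// Assumes a local Consul agent reachable at the default 127.0.0.1:8500.
package main

import (
	"fmt"
	"log"

	"github.com/hashicorp/consul/api"
)

func main() {
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}

	// Service discovery + health check: register "web" with an HTTP check
	// that expects a 200 OK every 10 seconds.
	err = client.Agent().ServiceRegister(&api.AgentServiceRegistration{
		ID:      "web-1",     // placeholder instance ID
		Name:    "web",       // placeholder service name
		Address: "10.0.0.10", // placeholder address
		Port:    8080,
		Check: &api.AgentServiceCheck{
			HTTP:     "http://10.0.0.10:8080/health",
			Interval: "10s",
			Timeout:  "1s",
		},
	})
	if err != nil {
		log.Fatal(err)
	}

	// Discover only the healthy instances of "web".
	entries, _, err := client.Health().Service("web", "", true, nil)
	if err != nil {
		log.Fatal(err)
	}
	for _, e := range entries {
		fmt.Printf("healthy instance: %s:%d\n", e.Service.Address, e.Service.Port)
	}

	// KV store: write and read back a dynamic configuration value.
	kv := client.KV()
	if _, err := kv.Put(&api.KVPair{Key: "config/web/max-conns", Value: []byte("100")}, nil); err != nil {
		log.Fatal(err)
	}
	pair, _, err := kv.Get("config/web/max-conns", nil)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%s = %s\n", pair.Key, pair.Value)
}
```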

2. Consul’s architecture

2.1 Terms in Consul

Before describing the architecture, we provide a glossary to help clarify what is being discussed:

  • Agent – An agent is a long-running daemon on every member of the Consul cluster. It is started by running consul agent. The agent can run in client or server mode. Since all nodes must be running an agent, it is simpler to refer to nodes as being either clients or servers, but there are other instances of agents. All agents can run the DNS or HTTP interfaces, and are responsible for running checks and keeping services in sync.

  • Client Agent – A client is an agent that forwards all RPCs to a server. The client is relatively stateless. The only background activity a client performs is taking part in the LAN gossip pool. This has a minimal resource overhead and consumes only a small amount of network bandwidth.

  • Server Agent – A server is an agent with an expanded set of responsibilities, including participating in the Raft quorum, maintaining cluster state, responding to RPC queries, exchanging WAN gossip with other data centers, and forwarding queries to leaders or remote data centers.

  • Datacenter – While the definition of a datacenter may seem obvious, there are finer details that must be considered. For example, in EC2, are multiple availability zones considered to comprise a single data center? We define a data center to be a networking environment that is private, low latency, and high bandwidth. This excludes communication that traverses the public Internet, but for our purposes, multiple availability zones within a single EC2 region would be considered part of a single data center.

  • Consensus – When used in our documentation, we use consensus to mean agreement upon the elected leader as well as agreement on the ordering of transactions. Since these transactions are applied to a finite-state machine, our definition of consensus implies the consistency of a replicated state machine. Consensus is described in more detail on Wikipedia, and our implementation is described here.

  • Gossip – Consul is built on top of Serf, which provides a full gossip protocol that is used for multiple purposes. Serf provides membership maintenance, failure detection, and event broadcasting. Our use of these is described further in the gossip documentation. Gossip involves random node-to-node communication, primarily over UDP.

  • LAN Gossip – Refers to the LAN Gossip pool, which contains nodes located on the same LAN or data center.

  • WAN Gossip – Refers to the WAN gossip pool, which contains only servers. These servers are primarily located in different data centers and typically communicate over the Internet or a WAN.

  • RPC – Remote Procedure Call. This is a request/response mechanism that allows a client to make a request of a server.

2.2 Consul from 10,000 Feet

Within each data center, we have a mixture of clients and servers. Three to five servers are expected. This strikes a balance between availability in the case of failure and performance, as consensus gets progressively slower as more machines are added. However, there is no limit to the number of clients, which can easily scale into the thousands or tens of thousands.

**All nodes in the data center participate in the gossip protocol.** This means there is a gossip pool that contains all the nodes in a given data center. This serves a few purposes: first, there is no need to configure server addresses on the clients; discovery is done automatically. Second, the work of detecting node failures is not placed on the servers but is distributed. This makes failure detection much more scalable than naive heartbeating schemes. Third, it is used as a messaging layer to notify when important events such as leader election take place.
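
As an illustration, the membership of these gossip pools can be listed through the agent API. The sketch below uses the official Go client (github.com/hashicorp/consul/api) against a local agent with default settings; note that the WAN listing is only meaningful when the queried agent is a server.

```go
// List the members of the LAN and WAN gossip pools known to the local agent.
package main

import (
	"fmt"
	"log"

	"github.com/hashicorp/consul/api"
)

func main() {
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}

	// LAN gossip pool: every client and server agent in this data center.
	lan, err := client.Agent().Members(false)
	if err != nil {
		log.Fatal(err)
	}
	for _, m := range lan {
		fmt.Printf("LAN member: %s (%s)\n", m.Name, m.Addr)
	}

	// WAN gossip pool: server agents across federated data centers
	// (typically empty when asked of a client agent, which does not join it).
	wan, err := client.Agent().Members(true)
	if err != nil {
		log.Fatal(err)
	}
	for _, m := range wan {
		fmt.Printf("WAN member: %s (%s)\n", m.Name, m.Addr)
	}
}
```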

The servers in each data center are all part of a single Raft peer set. This means that they work together to elect a single leader, a selected server that has extra duties. The leader is responsible for processing all queries and transactions. Transactions must also be replicated to all peers as part of the consensus protocol. Because of this requirement, when a non-leader server receives an RPC request, it forwards it to the cluster leader.
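
The elected leader and the Raft peer set can be observed through the Go client's status endpoints. A brief sketch, assuming a local agent on the default port:

```go
// Show the current Raft leader and the set of Raft peers (servers) in the
// local data center. Assumes a local Consul agent with default settings.
package main

import (
	"fmt"
	"log"

	"github.com/hashicorp/consul/api"
)

func main() {
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}

	// Address of the currently elected leader.
	leader, err := client.Status().Leader()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("raft leader:", leader)

	// All servers participating in Raft replication in this data center.
	peers, err := client.Status().Peers()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("raft peers:", peers)
}
```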

The server nodes also operate as part of a WAN gossip pool. This pool is different from the LAN pool in that it is optimized for the higher latency of the Internet and is expected to contain only other Consul server nodes. The purpose of this pool is to allow data centers to discover each other in a low-touch manner. Bringing a new data center online is as easy as joining the existing WAN gossip pool. Because the servers are all operating in this pool, it also enables cross-data-center requests. When a server receives a request for a different data center, it forwards it to a random server in the correct data center. That server may then forward the request to the local leader.

This results in very low coupling between data centers, but requests across data centers are still relatively fast and reliable thanks to failure detection, connection caching, and multiplexing.

In general, data is not replicated between different Consul data centers. When a request is made for a resource in another data center, the local Consul servers forward an RPC request to the remote Consul servers for that resource and return the results. If the remote data center is not available, then those resources will also not be available, but that will not otherwise affect the local data center. There are some special situations where a limited subset of data can be replicated, such as with Consul's built-in ACL replication capability, or with external tools like consul-replicate.
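
To illustrate, a cross-data-center lookup is just an ordinary query with the Datacenter field set in the query options; the local servers forward the RPC as described above. In this sketch the data center name "dc2" and the service name "web" are hypothetical:

```go
// Query healthy "web" instances registered in a remote data center ("dc2").
// If dc2 is unreachable, only this request fails; the local DC is unaffected.
package main

import (
	"fmt"
	"log"

	"github.com/hashicorp/consul/api"
)

func main() {
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}

	entries, meta, err := client.Health().Service("web", "", true,
		&api.QueryOptions{Datacenter: "dc2"})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("found %d instances in dc2 (request took %s)\n",
		len(entries), meta.RequestTime)
}
```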

In some places, client agents may cache data from the servers to make it available locally for performance and reliability. Examples include Connect certificates and intentions, which allow the client agent to make local decisions about inbound connection requests without a round trip to the servers. Some API endpoints also support optional result caching. This helps reliability because the local agent can continue to respond to some queries, like service discovery or Connect authorization, from cached data even if the connection to the servers is disrupted or the servers are temporarily unavailable.
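
The caching and stale-read options mentioned above are exposed on the Go client's QueryOptions. The sketch below is only an illustration: the fields used (AllowStale, UseCache, MaxAge, StaleIfError) and the CacheHit/CacheAge metadata are assumed to be available in the Consul and client versions in use, and "web" is again a placeholder service name.

```go
// Service discovery query that tolerates stale data and uses the local
// agent's result cache, so it can still be answered while servers are down.
package main

import (
	"fmt"
	"log"
	"time"

	"github.com/hashicorp/consul/api"
)

func main() {
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}

	opts := &api.QueryOptions{
		AllowStale:   true,             // any server may answer, not just the leader
		UseCache:     true,             // serve from the local agent's cache when possible
		MaxAge:       30 * time.Second, // accept cached results up to 30s old
		StaleIfError: 5 * time.Minute,  // fall back to stale cache if servers are unreachable
	}

	_, meta, err := client.Health().Service("web", "", true, opts)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("cache hit: %v, cache age: %s, last contact with leader: %s\n",
		meta.CacheHit, meta.CacheAge, meta.LastContact)
}
```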

3. Gossip Protocol and Raft Protocol

To understand Consul, you cannot get around the Gossip protocol and the Raft protocol.

3.1 Gossip Protocol

3.1.1 What is the Gossip protocol

The Gossip algorithm, as the name implies, is inspired by office gossip: as long as one person starts a rumor, within a limited amount of time everyone will come to know that piece of information. It spreads in a way similar to a virus, which is why Gossip has many aliases, such as "epidemic spreading algorithm", "virus algorithm", and "rumor propagation algorithm". See this blog post for more.
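
To illustrate the idea, here is a toy simulation (plain Go, not Consul or Serf code): a single node starts with a piece of gossip, and in each round every informed node tells one randomly chosen peer, so the number of rounds needed to inform the whole cluster grows only logarithmically with its size.

```go
// Toy epidemic/gossip simulation: one rumor spreads to all n nodes.
package main

import (
	"fmt"
	"math/rand"
)

func main() {
	const n = 1000 // cluster size (arbitrary for the demo)
	informed := make([]bool, n)
	informed[0] = true // node 0 starts the rumor
	count := 1

	for round := 1; count < n; round++ {
		// Snapshot who already knows the rumor, so nodes informed during
		// this round only start gossiping in the next one.
		snapshot := make([]bool, n)
		copy(snapshot, informed)

		for i := 0; i < n; i++ {
			if snapshot[i] {
				peer := rand.Intn(n) // pick a random peer to gossip with
				if !informed[peer] {
					informed[peer] = true
					count++
				}
			}
		}
		fmt.Printf("round %d: %d/%d nodes informed\n", round, count, n)
	}
}
```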

3.1.2 The Gossip Protocol in Consul

Consul uses two different gossip pools, which we refer to as the LAN pool and the WAN pool.

Consul has a LAN gossip pool for each data center that contains all members of that data center, both clients and servers. The LAN pool is used for a few purposes. Membership information allows clients to automatically discover servers, reducing the amount of configuration needed. Distributed failure detection allows the work of failure detection to be shared by the entire cluster instead of being concentrated on a few servers. Finally, the gossip pool allows for reliable and fast event broadcasts for events such as leader election.

The WAN pool is globally unique, as all servers should participate in the WAN pool regardless of data center. Membership information provided by the WAN pool allows servers to perform cross-data-center requests. The integrated failure detection allows Consul to gracefully handle an entire data center losing connectivity, or just a single server in a remote data center.

All of these features are provided by leveraging Serf. It is used as an embedded library to provide those features. From a user's perspective, this is unimportant, since the abstraction should be masked by Consul. As a developer, however, it is useful to know how this library is leveraged.

3.2 Raft Protocol

The Raft protocol is responsible for leader election and log replication. I recommend you check out this blog, where you can also explore the principles of the Raft protocol interactively through animations, as a kind of game.

4. Comparison with Eureka

Eureka is part of Netflix’s OSS suite. Development of Eureka 2.x has been discontinued, but 1.x is still actively maintained.

Eureka is a service discovery tool. The architecture is primarily client/server, with a set of Eureka servers per data center, usually one per availability zone. Typically, Eureka’s clients use an embedded SDK to register and discover services. For clients that are not natively integrated, a sidecar such as Ribbon is used to transparently discover services via Eureka.

Eureka provides a weakly consistent view of services, using best-effort replication. When a client registers with a server, that server attempts to replicate to the other servers but provides no guarantee. Service registrations have a short time-to-live (TTL), requiring clients to heartbeat to the servers. Unhealthy services or nodes stop heartbeating, causing them to time out and be removed from the registry. Discovery requests can be routed to any server, which may serve stale or missing data because of the best-effort replication. This simplified model allows for easy cluster administration and high scalability.

Consul provides a superset of features, including richer health checking, key/value storage, and multi-data-center awareness. Consul requires a set of servers in each data center, along with an agent on each client, similar to using a sidecar like Ribbon. The Consul agent allows most applications to remain unaware of Consul, performing service registration via configuration files and discovery via DNS or load-balancer sidecars.

Consul provides a strong consistency guarantee, since the servers replicate state using the Raft protocol. Consul supports a rich set of health checks, including TCP, HTTP, Nagios/Sensu-compatible scripts, or TTL-based checks like Eureka’s. Client nodes participate in gossip-based health checking, which distributes the work of health checks, unlike centralized heartbeating, which becomes a scalability challenge. Discovery requests are routed to the elected Consul leader, which allows them to be strongly consistent by default. Clients that explicitly allow stale reads enable any server to process their request, allowing for linear scalability like Eureka.

Consul’s strong consistency means it can be used as a lock service for leader election and cluster coordination. Eureka does not provide similar guarantees and typically requires running ZooKeeper for services that need to perform coordination or have stronger consistency needs.

Consul provides the toolkit of features needed to support a service-oriented architecture. This includes service discovery, but also rich health checking, locking, key/value storage, multi-data-center federation, an event system, and ACLs. Ecosystem tools such as consul-template and envconsul try to minimize the application changes required for integration, to avoid the need for native integration via an SDK. Eureka is part of the larger Netflix OSS suite, which expects applications to be relatively homogeneous and tightly integrated. As a result, Eureka only solves a limited part of the problem, expecting other tools such as ZooKeeper to be used alongside it.

To summarize the above paragraphs:

  • Consul provides strong consistency through the Raft protocol, while Eureka provides only weak consistency.
  • Consul distributes the work of health checking better via the Gossip protocol, instead of Eureka’s centralized heartbeating (where clients constantly ping the servers).
  • Thanks to the strong consistency provided by Raft, Consul can be used as a lock service for leader election and cluster coordination, while Eureka has to resort to ZooKeeper for this.

In addition, Spring Cloud provides support for Consul; the official site is https://spring.io/projects/spring-cloud-consul. I will write about its concrete usage later.

5. References:

  • www.consul.io/intro/index…
  • www.consul.io/intro/vs/eu…
  • www.consul.io/docs/intern…
  • www.consul.io/docs/intern…
  • www.consul.io/docs/intern…
  • www.cnblogs.com/xingzc/p/61…
  • www.cnblogs.com/xybaby/p/10…
