I. Interpreting the Distributed Structure of Redis

First of all, the idea that Redis uses a decentralized design is not quite accurate. Redis has two distributed modes, master-slave and cluster, and it is the community clustering scheme, Redis Cluster, that uses a decentralized design. Let's take a look at the evolution of Redis:

The figure above shows the standard Redis master-slave mode: only the Master receives write requests and copies the data to one or more slaves. This forms a good read-write separation mechanism, and multiple slaves can share the read load. So Redis master-slave follows the standard centralized idea in distributed design.
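
To make the read-write separation concrete, here is a minimal sketch using the redis-py client; the hosts and ports are placeholder assumptions for one master and one replica.

```python
# Minimal read/write-splitting sketch with redis-py. The addresses are
# placeholder assumptions: a master on port 6379 and a replica on port 6380.
import redis

master = redis.Redis(host="127.0.0.1", port=6379)
replica = redis.Redis(host="127.0.0.1", port=6380)

master.set("page:views", 1)          # every write goes to the master
print(replica.get("page:views"))     # reads can be served by a replica (may lag briefly)
```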

Redis is mostly used in scenarios of extremely concurrent in-memory I/O, so in the master-slave mode shown above, the Master is responsible not only for all writes but also for replicating data to every node. The resource consumption of the Master is very large, and the more slave nodes there are, the more obvious the problem becomes. Therefore, Redis also has a variant form of master-slave:

The figure above shows a tree-structured Redis master-slave topology. Its advantage is that the Master does not need to replicate data to countless slave nodes, only to the slave nodes in the next layer, which offloads as much of the replication work from the Master as possible.

The problem is that in such high-concurrency business scenarios the Master is always a concern: it takes all the writes, and if it crashes without an HA solution, the cluster as a whole becomes unavailable. Therefore, the Redis community's clustering solution, intended to relieve the pressure on the master, naturally adopted a distributed, centerless cluster mode.

In the figure above, the centralized cluster mode is on the left, and Redis Cluster's centerless (decentralized) mode is on the right.

Redis Cluster uses the concept of virtual slots, which are independent of the physical nodes (a point that is easy to confuse). The virtual slots are numbered 0 to 16383. For each piece of Redis data, the key is hashed (the exact formula is easy to find online; it is essentially CRC16(key) mod 16384) to determine which slot the data falls into, while which physical node is responsible for which virtual slots is specified by us.
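
As a sketch of the slot calculation, the snippet below hashes a key with the CRC16 (XMODEM) variant and takes it modulo 16384; hash tags ({...}) and other client details are ignored here.

```python
# Sketch of Redis Cluster's key-to-slot mapping: slot = CRC16(key) mod 16384.
def crc16_xmodem(data: bytes) -> int:
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF   # polynomial 0x1021, no reflection
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def key_slot(key: str) -> int:
    return crc16_xmodem(key.encode()) % 16384

print(key_slot("user:1001"))   # a number in 0..16383 identifying the virtual slot
```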

For example, when 1 GB of data is written to the Redis cluster as a series of keyed records, any cluster node that receives a record can compute which slot, and therefore which node, the record belongs to. If the slot belongs to the receiving node, it writes the data mapped to that slot; if not, it tells the client which node the write really belongs to. The client then issues a second request to the owning node, and in this way the 1 GB of data is sharded across the cluster.
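
The two-step write can be illustrated with the client loop below. It is only a sketch: send_command and the node addresses are hypothetical helpers, not a real client API, but the MOVED redirection it handles is how a cluster node points the client at the slot owner.

```python
# Illustrative client loop: try any node, and if the reply is a MOVED
# redirection, retry against the node that actually owns the slot.
def cluster_set(nodes, key, value):
    node = nodes[0]                          # start from any known node
    for _ in range(5):                       # bound the number of redirects
        reply = send_command(node, "SET", key, value)   # hypothetical helper
        if isinstance(reply, str) and reply.startswith("MOVED"):
            _, slot, owner = reply.split()   # "MOVED <slot> <host:port>"
            node = owner                     # second request goes to the slot owner
            continue
        return reply
    raise RuntimeError("too many MOVED redirects")
```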

Setting aside the advantages and disadvantages of Redis Cluster, we can see from the evolution above that moving from Redis's master-slave structure to the cluster is actually a process of decentralization, whose goal is to spread the load of concurrent requests from many clients and services. In addition, each node can itself be deployed in master-slave mode for high availability: even if a master node fails, its slave takes over. The downside of HA is that the number of nodes doubles again.

The biggest difference between Redis and RocketMQ is that Redis focuses more on the high concurrency of online business, while RocketMQ focuses on high-throughput ingestion and consumption of massive, backlogged data streams, so their purposes in choosing a distributed architecture differ. Of course, this does not mean that centralization is unsuitable for high concurrency. For example, OceanBase, a representative LSM-Tree implementation, is characterized by centralized processing and still achieves high-concurrency writes for online services and fast lookups of hotspot (recent) data.

In addition, because the nodes of a Redis cluster are peers in the distributed system, cluster management carries consistency risks. All kinds of unusual failures occur in production, and different nodes may develop inconsistent views of the cluster state, so the need for manual intervention to adjust the state of each node in the cluster increases.

II. Interpreting the Distributed Structure of Kafka and RocketMQ

Let's take a look at Kafka, which is better known than RocketMQ, and explain the distributed nature of a Kafka cluster.

Cluster management in Kafka comes from a ZooKeeper cluster: broker registration and discovery, as well as the election of the broker Controller, are assisted by ZooKeeper. The Controller, however, does not actually do anything during message processing; it only takes the lead among the nodes for tasks such as creating partitions and rebalancing partitions.
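
As a sketch of the election idea (not Kafka's actual implementation): with ZooKeeper, whichever broker first creates an ephemeral /controller znode becomes the controller, and the znode disappears if that broker dies. The kazoo-based example below assumes a local ZooKeeper and a made-up broker id.

```python
# Sketch of ZooKeeper-assisted controller election using the kazoo client.
from kazoo.client import KazooClient
from kazoo.exceptions import NodeExistsError

zk = KazooClient(hosts="127.0.0.1:2181")   # placeholder ZooKeeper address
zk.start()
try:
    # Ephemeral node: it is removed automatically if this broker's session dies.
    zk.create("/controller", b'{"brokerid": 1}', ephemeral=True)
    print("this broker becomes the controller")
except NodeExistsError:
    print("another broker already holds the controller role")
```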

The real work in Kafka is done by partition leaders and followers. For example, if a topic has four partitions and three replicas, there are 4 x 3 = 12 partition replicas in total. With four brokers, the replicas are spread evenly: each broker holds three partition replicas, acting as the master for one and as a slave for the other two.
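
The even spread can be illustrated with a small round-robin assignment sketch. Kafka's real assignment also randomizes the starting broker and considers racks; the numbers below just mirror the 4-partition, 3-replica, 4-broker example.

```python
# Illustrative round-robin placement of 4 partitions x 3 replicas over 4 brokers.
def assign_replicas(num_partitions=4, replication_factor=3, brokers=(0, 1, 2, 3)):
    assignment = {}
    for p in range(num_partitions):
        start = p % len(brokers)
        assignment[p] = [brokers[(start + r) % len(brokers)]
                         for r in range(replication_factor)]
    return assignment               # first replica in each list acts as the leader

print(assign_replicas())
# {0: [0, 1, 2], 1: [1, 2, 3], 2: [2, 3, 0], 3: [3, 0, 1]} -> 3 replicas per broker
```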

The producer obtains meta information from any node, finds the broker that holds the leader replica of the target partition, and writes the data to that leader replica. The leader then copies the data to the follower replicas on the other brokers in the cluster.
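
For instance, a minimal kafka-python producer might look like the sketch below; the broker address and the topic name "events" are assumptions, and the client performs the metadata lookup and leader routing internally.

```python
# Minimal kafka-python producer sketch.
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",   # any reachable broker can serve cluster metadata
    acks="all",                           # wait for followers to replicate before acking
)
producer.send("events", value=b"payload", key=b"user-42")  # key decides the target partition
producer.flush()
```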

With this partition structure, every broker has "master" capability: it accepts access to the topic partition data it leads and replicates that data. So, if you ask me whether Kafka is a centralized design, we will come back to that.

Now let's look at RocketMQ, Kafka's counterpart from Alibaba.

The RocketMQ architecture no longer uses a ZooKeeper cluster for service registration and discovery.

The RocketMQ queue model is very similar to Kafka's in many ways, but has its own characteristics: it is better suited to highly concurrent business messaging with many topics and ordering requirements. For example, a topic is divided into multiple shards, and each shard is divided into multiple queues; each queue can be consumed by only one consumer, which gives more concurrent, better-balanced consumption. I won't bore you with the details; let's focus on the distributed nature of RocketMQ.
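
To make the "one queue, one consumer" load balancing concrete, here is a simplified sketch. The modulo split only illustrates the idea; RocketMQ's built-in allocation strategies differ in detail.

```python
# Simplified sketch of queue-to-consumer balancing: every queue is owned by
# exactly one consumer in the group.
def allocate_queues(queue_ids, consumer_ids, my_id):
    consumers = sorted(consumer_ids)
    my_index = consumers.index(my_id)
    # queue i goes to consumer (i mod number_of_consumers)
    return [q for i, q in enumerate(sorted(queue_ids)) if i % len(consumers) == my_index]

# 8 queues split across 2 consumers: each consumer exclusively owns 4 queues.
print(allocate_queues(range(8), ["c1", "c2"], "c1"))   # [0, 2, 4, 6]
```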

A NameServer acts as a registry of brokers: newly registered brokers and brokers that exit abnormally are reported to the corresponding NameServer. The NameServer updates this registry under a lock when adding or deleting brokers, and periodically refreshes the latest broker information for the cluster.
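
A toy sketch of such a heartbeat-based registry is below; the class, names, and the 120-second timeout are illustrative assumptions, not RocketMQ's actual code.

```python
import time

# Toy NameServer-style registry: brokers register and heartbeat in, and entries
# whose last heartbeat is older than the timeout are evicted by a periodic scan.
class BrokerRegistry:
    def __init__(self, timeout_s=120):
        self.timeout_s = timeout_s
        self.brokers = {}                            # broker name -> last heartbeat time

    def register(self, broker_name):
        self.brokers[broker_name] = time.time()      # create or refresh the entry

    def scan_inactive(self):
        now = time.time()
        for name, last_seen in list(self.brokers.items()):
            if now - last_seen > self.timeout_s:
                del self.brokers[name]               # drop brokers that stopped reporting
```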

Producers connect to a NameServer to obtain the broker routes for the partitions they send data to, and consumers do the same. The sending and consuming process is similar to Kafka's, but it is more fine-grained, down to the level of a topic's queues.

Another feature of RocketMQ is that each broker can itself be subdivided into a master-slave pair: the master performs the queue operations, while the slave only synchronizes data and waits to replace the master if the master fails.

RocketMQ's NameServer has a much simpler structure than ZooKeeper. Producer and consumer access to brokers and partitions must go through the NameServer, although the NameServer cluster itself is centerless (its instances are independent of one another). All the RocketMQ brokers are managed through the NameServer, but the producer, consumer, and broker clusters do not depend on this centralized management so much as on a simple broker meta-information service; the actual data flow is handled by the brokers themselves.

In Kafka, the broker partition information is distributed in the meta cache of each broker, and producers and consumers can obtain the leader partition information from any broker, which makes Kafka look somewhat decentralized. However, the meta cache information actually comes from ZooKeeper, and ZooKeeper cannot be done without, so Kafka is still essentially centrally managed.

III. OceanBase Distributed Architecture

OceanBase is a typical LSM-Tree implementation. For background on LSM-Tree, you can see my other article on TiDB, which mainly describes the characteristics of RocksDB's LSM-Tree: Why do distributed databases like to use KV stores so much?

As for the architecture of OceanBase, I will not say too much this time, just a brief summary. The OceanBase architecture subtly incorporates the idea of the Lambda architecture, but its emphasis differs from Lambda's.

The OceanBase RootServer and UpdateServer are deployed on the same node and serve as the center of the distributed system.

RootServer manages the cluster.

UpdateServer handles incremental data updates and tries to keep the increments in memory, which makes queries over recent incremental data, usually the current day's data, as efficient as possible.

ChunkServer stores the baseline data, which is usually the historical data up to the previous day.

MergeServer accepts SQL from the client, interprets it, and merges the query results from the UpdateServer with the query results from the different ChunkServer nodes, that is, usually the current day's incremental data with the historical data up to the previous day.

This is very similar to the speed layer, batch layer, and serving layer idea of the Lambda architecture. When a client issues a query or statistics request, the UpdateServer satisfies real-time queries over the day's incremental data, the ChunkServer nodes provide distributed queries over the baseline data, and finally the MergeServer merges the UpdateServer results with the baseline results from each ChunkServer and returns them to the client. The OceanBase architecture design is a work of art.
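
A toy merge-on-read sketch of this idea follows; the keys and values are made up, and it only shows increments overriding the baseline at query time.

```python
# Toy sketch: today's increments (UpdateServer role) are overlaid on baseline
# rows (ChunkServer role) at query time (MergeServer role).
baseline = {"user:1": {"score": 10}, "user:2": {"score": 7}}     # baseline (historical) data
increments = {"user:1": {"score": 15}, "user:3": {"score": 3}}   # today's incremental updates

def merged_read(key):
    row = dict(baseline.get(key, {}))
    row.update(increments.get(key, {}))   # the newer increment overrides the baseline value
    return row or None

print(merged_read("user:1"))   # {'score': 15} -> increment wins
print(merged_read("user:2"))   # {'score': 7}  -> baseline only
```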

IV. Conclusion

This article mainly introduced the decentralized management of Redis Cluster and the architectural features of Kafka and RocketMQ, and touched on some architectural features of OceanBase along the way.

Message queue architectures rely on the centralized pattern only lightly: RocketMQ simply uses the NameServer for broker registration and discovery, and I think Kafka could do away with ZooKeeper in the future and design registration and discovery in a more decentralized way.

In contrast, the most mature Redis scheme is still master-slave. The performance advantages brought by Redis Cluster do not offset the immaturity and unreliability introduced by decentralization, which results in complex and difficult manual operation and maintenance. Use Redis Cluster with caution!

The architecture of OceanBase is elegant and artistic; I will take time to study it in practice and write a dedicated article. OceanBase is similar to Google's Bigtable and Hadoop's HBase, but it incorporates the idea of the Lambda architecture, which makes the system better fit real needs and more flexible and reliable. However, its clusters require a lot of resources.

I am the creator of "Read Byte", focusing on deep dives into big data technology and interpretations of distributed architecture.
