
As we know, Redis is widely used in Weibo's internal business scenarios, and the team there has accumulated a great deal of application and optimization experience. A Weibo expert once gave a talk on Redis optimization at Weibo that contains many excellent lessons.

As the saying goes, "stones from other hills may serve to polish jade": learning from others' experience can help us apply Redis better in our own business scenarios. In today's lesson, drawing on the talks given by Weibo's technical experts and my own exchanges with them, I will walk you through Weibo's Redis optimizations and share my takeaways.

First, let's look at Weibo's business requirements for Redis. These requirements are the starting point for Weibo's optimizations and improvements.

Weibo runs many services, such as the "Let Red Packets Fly" campaign, follower/user/read-count statistics, feed aggregation, music charts, and so on. These services have huge user bases, and the amount of data Redis stores for them often reaches the TB level.

As an application that serves end users directly, Weibo's user experience matters a great deal, and that requires solid technical support. Let's summarize Weibo's technical requirements for Redis:

  • Provide high-performance, high-concurrency read/write access with low read/write latency;
  • Support large-capacity storage;
  • Scale flexibly, with rapid capacity expansion for different services.

To meet these requirements, Weibo has made many improvements and optimizations to Redis. Broadly speaking, it has not only improved Redis's own data structures and working mechanisms, but also developed new components on top of Redis, including RedRock, which supports massive storage, and RedisService, which provides Redis as a service.

Next, let’s take a look at some improvements made by Weibo to Redis itself.

Weibo's basic improvements to Redis

From the Weibo experts' talks, we can see that Weibo's basic improvements to Redis fall into two categories: avoiding blocking and saving memory.

First, they persist data with a combination of full RDB snapshots and incremental AOF logs, which avoids having to trade data reliability against performance. Of course, since the official 4.0 release, Redis itself has also added a mixed RDB-and-AOF mechanism.
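As a side note, the official mixed mechanism that 4.0 introduced can be enabled with a standard redis.conf setting (this is Redis's built-in feature, not Weibo's custom scheme):

```
# redis.conf (Redis 4.0+): on AOF rewrite, write an RDB-format preamble
# into the AOF file, combining a full snapshot with incremental AOF records.
appendonly yes
aof-use-rdb-preamble yes
```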

Second, extra BIO threads perform the actual flush when AOF logs are written to disk, which prevents slow AOF disk writes from blocking the main thread.
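The idea can be sketched as follows (a minimal illustration in Python, not Weibo's actual code): the main thread only enqueues log records, while a background "BIO" thread does the write and fsync, so slow disk I/O never stalls command processing.

```python
import os
import queue
import threading

class AofWriter:
    """Sketch of BIO-style AOF flushing: the main thread enqueues records,
    a background thread performs the actual write + fsync."""
    def __init__(self, path):
        self.f = open(path, "ab")
        self.q = queue.Queue()
        self.worker = threading.Thread(target=self._flush_loop, daemon=True)
        self.worker.start()

    def append(self, record: bytes):
        # Called from the main thread: just enqueue, never touch the disk.
        self.q.put(record)

    def _flush_loop(self):
        # Background thread: do the slow, blocking disk work here.
        while True:
            rec = self.q.get()
            if rec is None:          # shutdown sentinel
                break
            self.f.write(rec)
            self.f.flush()
            os.fsync(self.f.fileno())

    def close(self):
        self.q.put(None)
        self.worker.join()
        self.f.close()
```

Usage is simple: call `append()` from the serving thread and `close()` on shutdown; all fsync latency is absorbed by the worker thread.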

Third, they added an aofNumber configuration item. It sets an upper bound on the number of AOF files kept on disk, avoiding the disk problems caused by accumulating too many AOF log files.
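The effect of such a cap can be sketched like this (the class and parameter names here are illustrative, not Weibo's implementation): roll to a new AOF segment when the current one grows too big, and delete the oldest segments beyond the configured count.

```python
import os

class RotatingAof:
    """Sketch of an aofNumber-style cap: keep at most `max_files` AOF
    segments, rolling to a new segment when the current one is full."""
    def __init__(self, dirname, max_files=3, max_bytes=1024):
        self.dirname = dirname
        self.max_files = max_files
        self.max_bytes = max_bytes
        self.seq = 0
        self._roll()

    def _roll(self):
        # Open a new zero-padded segment so lexical order matches numeric order.
        self.seq += 1
        self.f = open(os.path.join(self.dirname, f"aof.{self.seq:06d}"), "wb")
        segs = sorted(n for n in os.listdir(self.dirname) if n.startswith("aof."))
        for old in segs[:-self.max_files]:       # enforce the file-count cap
            os.remove(os.path.join(self.dirname, old))

    def append(self, record: bytes):
        if self.f.tell() + len(record) > self.max_bytes:
            self.f.close()
            self._roll()
        self.f.write(record)
        self.f.flush()
```

Note the trade-off: deleting old segments bounds disk usage, but it also discards the oldest history, so a full RDB snapshot (as in Weibo's scheme above) is needed as the recovery baseline.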

Finally, in the master-slave replication mechanism, an independent replication thread synchronizes the master and slave instances, avoiding blocking the main thread.

In terms of memory saving, a typical Weibo optimization is customizing data structures.

When caching users' follow lists in Redis, they customized a LongSet data type for storing those lists. This type is a set of Long-typed elements whose underlying data structure is a flat hash array. Before LongSet was designed, Weibo stored follow lists in the Hash type, but the Hash type consumes a lot of memory when storing large amounts of data.

Furthermore, when a cached follow list is evicted from Redis, the cache instance has to reload the user's follow list from the backend database and write it back as a Hash with HMSET, which degrades cache performance under heavy concurrent request pressure. Compared with the Hash type, LongSet's underlying hash array avoids the pointer overhead of a chained hash table, so it both saves memory and supports fast access.
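To make the idea concrete, here is a minimal sketch (the real LongSet is a C-level Redis type; this Python version only illustrates the layout) of a set of longs backed by one flat open-addressing array, which is exactly what removes the per-entry pointers of a chained hash table:

```python
from array import array

EMPTY = 0  # sketch assumption: 0 marks an empty slot, so 0 itself cannot be stored

class LongSet:
    """A flat array of 64-bit integers probed with open addressing:
    no per-entry pointers or entry objects, unlike a chained hash table."""
    def __init__(self, capacity=1024):
        self.cap = capacity
        self.slots = array("q", [EMPTY] * capacity)  # one contiguous block
        self.size = 0

    def _probe(self, value: int) -> int:
        # Linear probing: walk forward until we find `value` or an empty slot.
        i = value % self.cap
        while self.slots[i] not in (EMPTY, value):
            i = (i + 1) % self.cap
        return i

    def add(self, value: int) -> None:
        assert value != EMPTY and self.size < self.cap  # no resizing in this sketch
        i = self._probe(value)
        if self.slots[i] == EMPTY:
            self.slots[i] = value
            self.size += 1

    def __contains__(self, value: int) -> bool:
        return self.slots[self._probe(value)] == value
```

Each element costs exactly 8 bytes plus the array's load-factor slack, versus the entry structs, key/value pointers, and string headers a generic Hash would spend per element.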

From the improvements just introduced, you can see that the starting points of Weibo's Redis optimizations match the optimization goals we have repeatedly emphasized in earlier lessons. I have drawn two lessons from them myself.

The first lesson is that high performance and memory efficiency are always the focus of Redis applications, which follows directly from Redis's position in the overall business system.

Redis is typically deployed as a cache in front of the database layer and needs to return results quickly. In addition, Redis stores data in memory, which on the one hand gives it fast access, and on the other hand requires us to pay special attention to memory optimization in operations. I covered performance optimization and memory saving at length in previous lectures (Lectures 18 ~ 20, for example), so you can review them and really put them into practice.

The second lesson is that in practice, we often need to customize Redis or do secondary development to meet the needs of special scenarios, as with Weibo's custom data structure. To customize or extend Redis, though, you need to know and understand its source code. So once you have mastered Redis's fundamentals and key technologies, I recommend making the Redis source code your next goal: use the principles to deepen your understanding of the source, and with the source in hand you can start developing new features or data types. I showed you how to add a new data type to Redis in Lecture 13, so you can review that too.

Beyond these improvements, the Weibo experts also mentioned in their talks that, to meet the need for large storage capacity, they combine RocksDB with hard disks to expand single-instance capacity. Let's take a look.

How does Weibo handle large-capacity data storage?

The data Weibo's business layer needs to store often reaches the TB level, which requires expanding the storage capacity of Redis instances.

To meet this requirement, Weibo distinguishes hot data from cold data, keeps hot data in Redis, and writes cold data through RocksDB to the underlying hard disk.

Hot and cold data are very common in Weibo's business scenarios. For example, some topics are extremely hot when they first break, and huge numbers of users visit them, so Redis is needed to serve those requests.

However, once a hot topic fades, the number of visits drops sharply and its data turns cold. At that point, the cold data can be migrated from Redis into RocksDB and stored on disk. This frees the Redis instance's memory for hot data, while the total amount of data a single instance can hold is now bounded by the size of the disk.
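The migration logic can be sketched as follows (a toy model: a plain dict stands in for RocksDB, and eviction is least-recently-used; the real system of course persists cold data on SSD and does the I/O asynchronously):

```python
from collections import OrderedDict

class TieredStore:
    """Sketch of hot/cold tiering: hot keys live in memory (Redis's role),
    and the least-recently-used keys are demoted to a slower on-disk
    store (RocksDB's role, faked here with a dict)."""
    def __init__(self, hot_capacity=3):
        self.hot = OrderedDict()   # in-memory tier, kept in LRU order
        self.cold = {}             # stands in for RocksDB on SSD
        self.hot_capacity = hot_capacity

    def set(self, key, value):
        self.hot[key] = value
        self.hot.move_to_end(key)                 # mark as most recently used
        while len(self.hot) > self.hot_capacity:
            cold_key, cold_val = self.hot.popitem(last=False)
            self.cold[cold_key] = cold_val        # demote the coldest key

    def get(self, key):
        if key in self.hot:
            self.hot.move_to_end(key)
            return self.hot[key]
        if key in self.cold:
            # Promote back to the hot tier on access (this would be an
            # asynchronous RocksDB read in the real system).
            value = self.cold.pop(key)
            self.set(key, value)
            return value
        return None
```

A topic that stops being accessed naturally drifts to the cold tier, and a renewed burst of traffic pulls it back into memory.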

Based on Weibo's technical talks, I drew an architecture diagram of how they use RocksDB to expand Redis's capacity:

As you can see in the figure, Redis uses asynchronous threads to read and write data in RocksDB.

This is because RocksDB's read/write latency is, after all, higher than Redis's memory access latency, and the asynchronous threads keep cold-data reads and writes from blocking the Redis main thread. RocksDB itself is responsible for laying out and managing the cold data on SSDs. It is mature and stable by now, and well up to the job of managing Redis's cold data.

I have also distilled two lessons from Weibo's capacity expansion with RocksDB and SSDs, which I'd like to share with you.

First, some business scenarios genuinely need large-capacity single instances. Although we can spread data across the multiple instances of a sharded cluster, that approach brings cluster operation and maintenance overhead, since it involves managing a distributed system. Moreover, a sharded cluster's size is limited; if each instance's storage capacity is increased, the cluster can hold more data even with fewer instances.

The second lesson is that if you want large-capacity Redis instances, combining SSDs with RocksDB is a good solution. The 360 open-source Pika we studied in Lecture 28, as well as Weibo's practice, are both excellent references.

RocksDB writes data quickly and uses memory to cache part of the data, providing read performance on the order of tens of thousands of operations per second. Meanwhile, SSD performance is improving rapidly, and a single SSD can deliver hundreds of thousands of IOPS. Together, these technologies let Redis offer large storage capacity without giving up read/write performance. When you face the same needs, you can also use RocksDB on SSDs to store large volumes of data.

How does Weibo offer Redis as a service across multiple business lines?

Weibo's different services have different capacity requirements for Redis, and those requirements may call for scaling out or in as the services evolve.

To support these business requirements flexibly, Weibo built RedisService to offer Redis as a service. "Servitization" here means using Redis clusters to serve different business scenarios, with each business getting independent resources that do not interfere with one another.

At the same time, all Redis instances form a resource pool, and the pool itself can be expanded easily. When a new service comes online, it applies for resources from the pool; when an old service goes offline, its unused resources are returned to the pool.

With Redis offered as a service, it becomes very convenient for different business lines to use it. Business teams no longer need to deploy and operate Redis themselves; their application clients simply access the RedisService cluster. Even if a service's data volume grows, there is no need to worry about instance capacity, because the service cluster can expand itself online to keep up.

In building RedisService, Weibo adopted a Codis-like scheme that connects clients and servers through a cluster proxy layer. From Weibo's public technical material, we can see that they implemented rich service-level functionality in the proxy layer:

  • Client connection listening, with automatic addition and removal of listening ports;
  • Redis protocol parsing: determine which requests need routing, and return an error for invalid or unsupported requests;
  • Request routing: based on the mapping rules between data and backend instances, route each request to the corresponding backend instance for processing and return the result to the client;
  • Metrics collection and monitoring: collect the cluster's running state and send it to dedicated visualization components for monitoring.
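The routing step can be sketched as follows (the class name, business names, and backend addresses are all hypothetical; real Redis Cluster hashes keys with CRC16 mod 16384, and `zlib.crc32` is used here only as a standard-library stand-in):

```python
import zlib

class ProxyRouter:
    """Sketch of the proxy layer's request routing: map a key to a slot,
    then map the slot to a backend instance within the business line's
    own isolated pool."""
    SLOTS = 16384  # same slot count as Redis Cluster

    def __init__(self):
        # business name -> list of backend instance addresses (isolated pools)
        self.pools = {}

    def register(self, business, backends):
        self.pools[business] = list(backends)

    def route(self, business, key: str) -> str:
        backends = self.pools.get(business)
        if not backends:
            raise KeyError(f"unknown business: {business}")
        slot = zlib.crc32(key.encode()) % self.SLOTS
        # Contiguous slot ranges per backend, Codis-style.
        idx = slot * len(backends) // self.SLOTS
        return backends[idx]
```

Because each business registers its own backend list, routing is deterministic per key and data from different businesses never shares an instance, which is exactly the isolation requirement described above.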

In addition, within a service cluster there is a configuration center that manages the metadata of the entire cluster. Meanwhile, instances run in master/slave mode to ensure data reliability, and different services' data is deployed on different instances, isolated from one another.

Based on my understanding, I have drawn a schematic of the structure of Weibo's RedisService cluster. Take a look.

From Weibo's practice, we can see that when multiple business lines share common Redis usage requirements, the common approach is to provide a platform-level service, that is, servitization.

When turning a common function into a platform service, we need to focus on issues such as smooth platform expansion, multi-tenant support and business data isolation, flexible routing rules, and rich monitoring functions.

To make the platform easy to expand, we can use Codis or Redis Cluster. Multi-tenant support goes hand in hand with business data isolation: both require resource isolation, that is, deploying different tenants' or services' data separately so resources are never mixed. For routing rules and monitoring, Weibo's current scheme, implementing both functions in the proxy layer, works well.

Only when these functions are well implemented can a platform service efficiently support the needs of different lines of business.

Summary

In today's lesson, we studied Weibo's Redis practice and summarized many lessons from it. In short, Weibo's technical requirements for Redis boil down to three points: high performance, large capacity, and easy scaling.

To meet these needs, in addition to optimizing Redis itself, Weibo also developed its own extensions, including the RocksDB-based capacity expansion mechanism and the RedisService cluster.

Finally, I want to share two more thoughts of my own.

The first is about Weibo's RedisService cluster. This kind of optimization is the main line of work for engineers in the platform departments of large companies.

Splitting vertically by business and horizontally by platform are basic ideas for building large-scale systems. Vertical splitting by business means deploying different businesses' data separately so that they do not interfere with each other. Horizontal splitting by platform means that when different business lines have the same requirements for a supporting platform, a unified platform-level cluster service can be built to support them all. Redis is a typical basic service needed by multiple business lines, so offering it as a clustered service helps improve the overall efficiency of the business.

The second is the importance of hands-on coding on the way to becoming a Redis expert.

I find that secondary modification or development of Redis is almost inevitable at large companies, which follows from their large businesses and wide range of requirements.

Weibo's custom data structures, RedRock, and RedisService are all typical examples. So if you want to become a Redis master and join such a team, a good method is to study the principles first and then the code, practicing as you learn. The principles guide what to focus on while reading the code, and hands-on practice is critical, covering both deployment and source reading. As the old poem goes, "what we learn on paper is always shallow; true understanding comes only from practice." I hope you will not just learn the principles, but really use them to guide practice and improve your practical skills.

One question per lesson

As usual, here is a quick question for you: in your actual use of Redis, have you had any classic optimization or secondary-development experiences?

Feel free to share your experience in the comments so we can discuss it together. And if you found today's content helpful, you're welcome to share it with your friends and colleagues.
