In order to improve the quality of the content, this article has been revised and changed.

1. Preface

This article analyzes and summarizes the knowledge needed to build a distributed IM system that handles message volumes on the order of 100 million. It involves no abstruse technical concepts, and it tries to be understandable to both novices and veterans.

This article will not provide a general set of IM solutions, nor will it judge a particular architecture, but rather discuss common challenges to designing IM systems and industry solutions.

There is no such thing as a universal IM architecture. Different solutions have their own advantages and disadvantages, and the system that best fits the business is the good one.

In the context of limited human, material, and time resources, there are often many trade-offs to be made. In this case, an IM system that supports rapid iteration and easy expansion is the optimal solution.

Learning and communication:

  • IM/push technology development discussion group 5: 215477170 [Recommended]
  • Introduction to Mobile IM Development: One Entry is Enough: Developing Mobile IM from Zero
  • Open source IM framework source: https://github.com/JackJiang2011/MobileIMSDK

2. Related articles

The following two articles cover similar ground and are worth reading together with this one if you are interested.

  • A Set of IM Architecture Technology for 100 Million Users (Part I): Overall Architecture, Service Separation, etc.
  • A Set of IM Architecture Technology for 100 Million Users (Part II): Reliability, Ordering, Weak Network Optimization, etc.

3. Common IM terms

0) User: A user of the system.

1) Message: The content exchanged between users. In an IM system, messages are usually one of the following types: text, emoticon, picture, video, file, and so on.

2) Session: Usually the association between two users that is established by chatting.

3) Group: Usually the association between multiple users that is established by chatting.

4) Terminal: The device or platform on which the user runs the IM client (usually Android, iOS, Web, etc.).

5) Unread: The number of messages the user has not yet read.

6) User status: Whether the user is online, offline, suspended, etc.

7) Relationship chain: The relationships between users, typically one-way friend, two-way friend, follow, and so on. Note the difference from a session: a session is created only when a user starts a chat, whereas a relationship exists without any chat. Relationship chains can be stored in a graph database (Neo4j, etc.), which expresses real-world relationships naturally and is easy to model.

8) Single chat: A one-to-one chat.

9) Group chat: A chat among multiple people.

10) Customer service: In e-commerce, users usually need pre-sales and after-sales consultation, so customer service agents are introduced to handle their inquiries.

11) Message distribution: In e-commerce, a store usually has more than one customer service agent, and deciding which agent handles a user’s inquiry is message distribution. The decision usually follows a series of rules, for example whether the agent is online (if not, the message must be redistributed to another agent), whether the inquiry is pre-sales or after-sales, and how busy each agent currently is.

12) Mailbox: In this article, a mailbox is a Timeline, that is, a queue for sending and receiving messages.

4. Read diffusion vs write diffusion

Read diffusion and write diffusion are two technical concepts that are often involved in IM systems. Let’s take a look.

4.1 Read diffusion

As shown in the picture above, user A has a mailbox for every person and every group that A chats with (some blog posts call it a Timeline; see the discussion in “The Synchronization and Storage Scheme of Chat Messages in Modern IM System”), and A has to read every mailbox that contains new messages when viewing chat messages.

Note the difference from a Feeds system: there, each person has one mailbox, a write goes only to the author’s own mailbox, and a read has to fetch from everyone’s mailbox. Read diffusion in an IM system instead usually means one mailbox per pair of chatting users, or one mailbox per group.

The advantages of read diffusion:

  • 1) The write operation (sending a message) is very light: for both single chat and group chat, the message is written only once, to the corresponding mailbox;
  • 2) Each mailbox is naturally the chat history of one conversation, which makes viewing and searching chat history convenient.

Disadvantages of read diffusion: The read operation (reading messages) is heavy, because the reader has to check every mailbox that may hold new messages. In complex business scenarios, turning one stored source message into the message that each reader should see also requires complex logic at read time.

4.2 Write diffusion

Now let’s look at write diffusion.

As the picture above shows, in write diffusion, each person reads messages only from their own mailbox.

When writing (sending) messages, single chats and group chats are handled as follows:

  • 1) Single chat: the message is written to both your own mailbox and the other party’s mailbox. If the chat history between the two people needs to be viewable, one more copy has to be written (you could instead reconstruct the history by scanning your personal mailbox, but that is very inefficient);
  • 2) Group chat: the message is written to the mailbox of every group member. If the group’s chat history needs to be viewable, one more copy has to be written as well. As you can see, write diffusion greatly amplifies write operations for group chats.

PS: Message fan-out in group chats is a real technical pain point in IM development. If you are interested, read “summary of the implementation of IM group chat technology”.

Advantages of write diffusion:

  • 1) The read operation is very light;
  • 2) Multi-terminal message synchronization is very convenient.

Disadvantages of write diffusion: The write operation is heavy, especially for group chats: in a group with many members, one source message has to be fanned out into “number of members - 1” target messages, which is expensive at scale.
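To make the trade-off concrete, here is a minimal in-memory sketch of the two models. The class and method names (ReadDiffusionStore, WriteDiffusionStore, join, send, read) are hypothetical and chosen only for illustration; a real system would back the mailboxes with a database or cache. In this sketch the write diffusion store also writes a copy to the sender’s own mailbox, as described above for single chat.

```python
from collections import defaultdict


class ReadDiffusionStore:
    """Read diffusion: one mailbox per conversation (single chat or group)."""

    def __init__(self):
        self.mailboxes = defaultdict(list)    # conversation_id -> messages
        self.memberships = defaultdict(set)   # user_id -> conversation_ids

    def join(self, user_id, conversation_id):
        self.memberships[user_id].add(conversation_id)

    def send(self, conversation_id, sender, text):
        # A single write, no matter how many members the conversation has.
        self.mailboxes[conversation_id].append((sender, text))
        return 1  # number of mailbox writes

    def read(self, user_id):
        # The reader has to visit every conversation mailbox it belongs to.
        return {cid: list(self.mailboxes[cid])
                for cid in sorted(self.memberships[user_id])}


class WriteDiffusionStore:
    """Write diffusion: one mailbox per user, messages are copied on send."""

    def __init__(self):
        self.mailboxes = defaultdict(list)    # user_id -> messages
        self.members = defaultdict(set)       # conversation_id -> user_ids

    def join(self, user_id, conversation_id):
        self.members[conversation_id].add(user_id)

    def send(self, conversation_id, sender, text):
        writes = 0
        for user_id in self.members[conversation_id]:
            # One copy per member (including the sender's own mailbox).
            self.mailboxes[user_id].append((conversation_id, sender, text))
            writes += 1
        return writes

    def read(self, user_id):
        # The reader only looks at its own mailbox.
        return list(self.mailboxes[user_id])
```

With a three-member group, send performs one write in the read diffusion store and three writes in the write diffusion store, while read touches one mailbox per conversation in the former and exactly one mailbox in the latter.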

In the Feeds system:

  • 1) Write diffusion is also called Push, fan-out, or write-fanout;
  • 2) Read diffusion is also called: Pull, fan-in, or read-fanout.

5. Technical solutions for unique IDs

5.1 Basic Knowledge

In general, ID design falls into the following categories:

  • 1) UUID;
  • 2) Snowflake-based ID generation;
  • 3) Step-size (segment) allocation backed by an application DB;
  • 4) Auto-increment IDs based on Redis or a DB;
  • 5) Unique IDs generated by special rules.

For specific implementations and their advantages and disadvantages, refer to the following series of articles on IM message IDs:

  • “IM Message ID Technology Topic (1): WeChat Mass IM Chat Message Serial Number Generation Practice (Algorithm Principle)”
  • “IM Message ID Technology Topic (2): WeChat Mass IM Chat Message Serial Number Generation Practice (Disaster Solution)”
  • “IM Message ID Technology Topic (3): Decrypt the Chat Message ID Generation Strategy of Rongyun IM Product”
  • “IM Message ID Technology Topic (4): Deep Decryption of Distributed ID Generation Algorithm of Meituan”
  • “IM Message ID Technology Topic (5): Open Source Distributed ID Generator UidGenerator Technical Implementation”
  • “IM Message ID Technology Topic (6): High-performance ID Generator (Tinyid)”

The main places where unique ids are required in IM systems are:

  • 1) Chat session ID;
  • 2) ID of the chat message.

5.2 Message ID

Let’s look at three issues to consider when designing message IDs.

5.2.1) Can the message ID be non-incrementing?

Let’s see what happens if the ID does not increase:

  • 1) With string IDs, storage space is wasted, and the storage engine cannot place adjacent messages together, which reduces message write and read performance;
  • 2) With random numeric IDs, the storage engine still cannot place adjacent messages together, which increases random I/O and reduces performance; in addition, random IDs alone do not guarantee uniqueness.

Therefore, the message ID should preferably be incrementing.

5.2.2) Global increment vs user-level increment vs session-level increment:

Global increment: Message IDs increase over time across the entire IM system. Snowflake is commonly used for this (strictly speaking, Snowflake IDs are only guaranteed to increase within a single worker). In this case, if your system uses read diffusion, each message has to carry the ID of the previous message to prevent message loss: the front end uses that previous ID to determine whether a message is missing and, if so, pulls again.
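As an illustration of why Snowflake-style IDs only increase per worker, here is a minimal sketch of such a generator. The bit layout (timestamp, 10 worker bits, 12 sequence bits), the custom epoch, and the class name SnowflakeLikeGenerator are assumptions made for this example and are not taken from any system discussed in this article.

```python
import threading
import time


class SnowflakeLikeGenerator:
    """Sketch of a Snowflake-style ID: timestamp | worker id | sequence."""

    EPOCH_MS = 1_600_000_000_000  # arbitrary custom epoch (assumption)
    WORKER_BITS = 10
    SEQ_BITS = 12

    def __init__(self, worker_id, clock=lambda: int(time.time() * 1000)):
        if not 0 <= worker_id < (1 << self.WORKER_BITS):
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self.clock = clock          # injectable millisecond clock
        self.last_ms = -1
        self.seq = 0
        self.lock = threading.Lock()

    def next_id(self):
        with self.lock:
            now = self.clock()
            if now < self.last_ms:
                # Ordering cannot be guaranteed if the clock goes backwards.
                raise RuntimeError("clock moved backwards")
            if now == self.last_ms:
                self.seq = (self.seq + 1) & ((1 << self.SEQ_BITS) - 1)
                if self.seq == 0:
                    # Sequence exhausted: busy-wait for the next millisecond.
                    while now <= self.last_ms:
                        now = self.clock()
            else:
                self.seq = 0
            self.last_ms = now
            return (((now - self.EPOCH_MS) << (self.WORKER_BITS + self.SEQ_BITS))
                    | (self.worker_id << self.SEQ_BITS)
                    | self.seq)
```

Within one worker, every ID is larger than the previous one. Across workers, two IDs produced in the same millisecond are ordered by worker ID rather than by actual generation order, which is why the increment is only guaranteed at the worker level.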

User-level increment: Message IDs are guaranteed to increase only within a single user. Different users do not affect each other, so the same ID may appear under different users. Typical representative: WeChat (see “Wechat mass IM Chat Message serial number generation practice (algorithm principle)”). For a write diffusion system, the mailbox timeline ID and the message ID need to be designed separately: the mailbox timeline ID is user-level incrementing, and the message ID is globally incrementing. For a read diffusion system, there is little need for user-level increment.

Session-level increment: Message IDs are guaranteed to increase only within a single session. Different sessions do not affect each other, so the same ID may appear in different sessions. Typical representative: QQ.

5.2.3) Continuous increment vs monotonic increment:

Continuous increment means IDs are generated as 1, 2, 3, ..., N. Monotonic increment only requires that each newly generated ID is larger than all IDs generated before it; the IDs do not have to be consecutive.

As far as I know, QQ uses session-level continuous increment for its message IDs. The advantage is that if a message is lost, the client notices that the IDs are not consecutive when the next message arrives and requests the missing message from the server, so no message is lost.

At this point you might be thinking: can’t I just pull periodically to check for missing messages? No: if a person has thousands of sessions, how many pulls would that take? The server would not be able to withstand the load.

For read diffusion, continuous increment is a good choice for message IDs. With monotonic increment, each message has to carry the ID of the previous message (that is, the chat messages form a linked list) so that the receiver can determine whether a message was lost.
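The two loss-detection strategies can be sketched on the receiving side as follows. Both classes (ContinuousReceiver, LinkedReceiver) are hypothetical illustrations of the idea, not the actual logic of QQ or any other product; the sketch assumes the caller pulls the reported gap from the server right away.

```python
class ContinuousReceiver:
    """Session-level continuous IDs: a gap in the sequence reveals a loss."""

    def __init__(self):
        self.last_seq = 0

    def on_push(self, seq):
        """Return the sequence numbers that have to be pulled from the server."""
        if seq <= self.last_seq:
            return []  # duplicate or already-seen message
        missing = list(range(self.last_seq + 1, seq))
        self.last_seq = seq  # assumes the caller pulls `missing` immediately
        return missing


class LinkedReceiver:
    """Monotonic IDs: each message carries the ID of its predecessor."""

    def __init__(self):
        self.last_id = None

    def on_push(self, msg_id, prev_id):
        """Return True if the predecessor was not the last message we saw."""
        gap_detected = prev_id != self.last_id
        self.last_id = msg_id
        return gap_detected
```

With continuous IDs the receiver knows exactly which messages to request; with the linked-list form it only learns that something between its last message and the new one is missing and has to pull that range.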

5.2.4) To summarize:

Write diffusion: the mailbox timeline ID is user-level incrementing and the message ID is globally incrementing; monotonic increment is sufficient.

Read diffusion: message IDs can be session-level incrementing, preferably continuous.

5.3 Session ID

Let’s look at the issues that need attention when designing session IDs.

A simple way to generate the session ID is to use a special rule: concatenate from_user_id and to_user_id.

The concatenation logic can look like this:

  • If from_user_id and to_user_id are both 32-bit integers, they can easily be combined into a 64-bit session ID with bit operations: conversation_id = (from_user_id << 32) | to_user_id. Before combining, make sure the smaller user ID is used as from_user_id, so that the session ID of any two users can be computed directly, no matter which of them starts the session;
  • If from_user_id and to_user_id are 64-bit integers, they can only be concatenated as a string.
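The two concatenation rules can be sketched as below. The function names and the ":" separator of the string form are assumptions for illustration; the 32-bit variant assumes unsigned 32-bit user IDs.

```python
def conversation_id_32(user_a, user_b):
    """Pack two unsigned 32-bit user IDs into one 64-bit session ID."""
    lo, hi = sorted((user_a, user_b))  # smaller ID always goes into the high bits
    if lo < 0 or hi >= 1 << 32:
        raise ValueError("user IDs must fit in 32 bits")
    return (lo << 32) | hi


def split_conversation_id(conversation_id):
    """Recover the two user IDs from a packed 64-bit session ID."""
    return conversation_id >> 32, conversation_id & 0xFFFFFFFF


def conversation_id_str(user_a, user_b):
    """String form for 64-bit user IDs, again with the smaller ID first."""
    lo, hi = sorted((user_a, user_b))
    return f"{lo}:{hi}"
```

Because the IDs are sorted first, conversation_id_32(7, 3) and conversation_id_32(3, 7) yield the same value, so either participant can compute the session ID without a lookup.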

My former employer used the first method, but it has a problem: as the business expands globally, 32-bit user IDs may run out, and widening them to 64 bits then requires drastic changes. A 32-bit integer ID looks as if it could hold 2.1 billion users, but IDs are usually not assigned consecutively, so that outsiders cannot infer the real number of users; in practice, 32-bit user IDs are therefore far from enough. A design that depends this heavily on the user ID format is not desirable.

Therefore, the session ID can instead be a globally incrementing ID, with a mapping table that stores from_user_id, to_user_id, and conversation_id.
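A minimal sketch of such a mapping table follows, using an in-memory SQLite database as a stand-in for the real store. The table and column names (conversation, user_lo, user_hi) and the helper functions are hypothetical; the point is only that the session ID no longer depends on the width of the user IDs.

```python
import sqlite3


def open_registry():
    """Create an in-memory mapping table: (user_lo, user_hi) -> conversation_id."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE conversation (
               conversation_id INTEGER PRIMARY KEY AUTOINCREMENT,
               user_lo INTEGER NOT NULL,
               user_hi INTEGER NOT NULL,
               UNIQUE (user_lo, user_hi)
           )"""
    )
    return conn


def get_or_create_conversation(conn, from_user_id, to_user_id):
    """Return the session ID for a user pair, allocating one on first use."""
    lo, hi = sorted((from_user_id, to_user_id))  # order-independent key
    with conn:  # commit the insert (a no-op if the pair already exists)
        conn.execute(
            "INSERT OR IGNORE INTO conversation (user_lo, user_hi) VALUES (?, ?)",
            (lo, hi),
        )
    row = conn.execute(
        "SELECT conversation_id FROM conversation WHERE user_lo = ? AND user_hi = ?",
        (lo, hi),
    ).fetchone()
    return row[0]
```

The cost compared with the concatenation rule is one extra lookup per new session, which is usually cached.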

6. Getting new messages: push mode vs pull mode vs push-pull combination mode

In an IM system, there are three possible ways to obtain new messages:

  • 1) Push mode: when there is a new message, the server proactively pushes it to all terminals (iOS, Android, PC, etc.);
  • 2) Pull mode: the front end initiates a request to pull messages. To keep messages real-time, push mode is generally used for new messages, and pull mode is generally used to fetch historical messages;
  • 3) Push-pull combination mode: when there is a new message, the server first pushes a new-message notification to the front end, and the front end then pulls the messages from the server after receiving the notification.

The simplified figure of push mode is as follows:

As shown in the preceding figure: normally, a message sent by a user is stored by the server and then pushed to all terminals of the receiver.

However, a push may be lost. The most common cause is that the user is “pseudo-online”: if the push service is built on a long connection, that connection may already be broken (the user has dropped offline), but the server usually notices only after a heartbeat cycle, so in the meantime it wrongly assumes the user is still online. (“Pseudo-online” is a term I coined myself, for lack of a better word.) Therefore, messages can be lost if only push mode is used.

PS: For why the “pseudo-online” problem described by the author occurs, read “Why does TCP based mobile IM still need heartbeat preservation?”.

The simplified figure of push-pull combination mode is as follows:

The push-pull combination mode solves the problem that push mode may lose messages: when a user sends a new message, the server pushes a notification, and the front end then requests the latest message list. To guard against a lost notification, the front end can also pull proactively at regular intervals. As you can see, push-pull combination works best with write diffusion, because write diffusion only needs to pull one timeline (the user’s own mailbox), whereas read diffusion has N timelines (one per mailbox), and pulling all of them periodically performs poorly.
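The flow can be sketched with a single user mailbox under write diffusion. MailboxServer, PullingClient, and the deliver_notification flag (used here to simulate a lost push) are hypothetical names for illustration only.

```python
class MailboxServer:
    """One user's mailbox (write diffusion): an append-only timeline."""

    def __init__(self):
        self.timeline = []    # list of (seq, text)
        self.listeners = []   # notification callbacks of connected clients

    def append(self, text, deliver_notification=True):
        seq = len(self.timeline) + 1
        self.timeline.append((seq, text))
        if deliver_notification:  # False simulates a lost push
            for notify in self.listeners:
                notify()

    def pull(self, after_seq):
        return [m for m in self.timeline if m[0] > after_seq]


class PullingClient:
    """Pulls after every notification and can also pull on a timer."""

    def __init__(self, server):
        self.server = server
        self.sync_point = 0
        self.received = []
        server.listeners.append(self.pull)

    def pull(self):
        new_messages = self.server.pull(self.sync_point)
        self.received.extend(new_messages)
        if new_messages:
            self.sync_point = new_messages[-1][0]
```

The notification itself carries no payload, so losing one only delays delivery: the next notification or the next periodic pull fetches everything after the client’s sync point.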

7. Industry IM solutions for reference

Now that we’ve seen some common design problems of IM systems, let’s look at how the industry designs IM systems.

Studying the mainstream solutions in the industry helps us understand IM system design more deeply. The following analysis is based on information publicly available on the Internet and may not be entirely accurate.

7.1 WeChat

Although much of WeChat’s basic framework is self-developed, this does not prevent us from understanding its architecture design.

From the article “Fast Fission: Witness the Evolution of the Powerful Background Structure of wechat from 0 to 1 (2)”, we can see that WeChat uses write diffusion plus push-pull combination. Because group chat also uses write diffusion, which is very resource-intensive, WeChat groups have a maximum size (currently 500). This is an obvious disadvantage of write diffusion: supporting groups of tens of thousands of people is difficult.

The article also shows that WeChat uses a multi-data-center architecture:

▲ The picture is quoted from “Fast Fission: Witness the Evolution of wechat’s Powerful Background Architecture from 0 to 1 (2)”

Each WeChat data center is autonomous and holds the full data set. Data centers synchronize data with each other through a self-developed message queue.

To ensure data consistency, each user belongs to exactly one data center and reads and writes data only in that data center. If a user connects to a different data center, the connection is automatically redirected to the data center the user belongs to. To access other users’ data, a user only needs to access their own data center, since every data center holds the full data set.

Meanwhile, WeChat uses a three-campus disaster recovery architecture and Paxos to ensure data consistency.

From the article “Wechat mass IM Chat message serial number generation Practice (Disaster Recovery Plan)”, we can see that WeChat’s ID design is based on DB-backed step-size allocation plus user-level increment.

As shown in the figure below:

▲ Picture quoted from “Wechat massive IM Chat message serial number generation Practice (Disaster Solution)”

In WeChat’s sequence number generator, an arbitration service generates a routing table (which holds the full mapping from UID segments to AllocSvr instances), and the routing table is synchronized to the AllocSvr instances and the clients. If an AllocSvr goes down, the arbitration service reschedules its UID segments to other AllocSvr instances.

PS: The WeChat team has shared a lot of technical material. If you are interested, see “QQ, wechat technology sharing – summary”.
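The general idea of step-size allocation with user-level increment can be sketched as follows. This is a simplified illustration of the technique, not WeChat’s actual implementation: the class name SegmentSequence, the dict standing in for persistent storage, and the step value are all assumptions.

```python
class SegmentSequence:
    """User-level sequence numbers with step-size (segment) persistence."""

    def __init__(self, store, step=100):
        self.store = store  # dict standing in for the DB: user_id -> persisted upper bound
        self.step = step
        self.cur = {}       # in-memory current sequence per user

    def next_seq(self, user_id):
        cur = self.cur.get(user_id)
        if cur is None:
            # First use after a (re)start: resume from the persisted upper
            # bound, so previously issued numbers are never reused.
            cur = self.store.get(user_id, 0)
        cur += 1
        if cur > self.store.get(user_id, 0):
            # Segment exhausted: persist a new upper bound. This is the only
            # storage write, and it happens once per `step` numbers.
            self.store[user_id] = self.store.get(user_id, 0) + self.step
        self.cur[user_id] = cur
        return cur
```

After a restart, the generator skips the unused remainder of the old segment, so the sequence stays monotonically increasing per user but is not continuous.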

7.2 DingTalk

Little public information is available about DingTalk. From the article “Ali nail technology sharing: enterprise IM king – nail on the back end of the architecture of the extraordinary” we only learn that DingTalk initially used the write diffusion model and later, in order to support groups of thousands of people, appears to have been optimized to read diffusion.

When talking about Alibaba’s IM systems, we have to mention Tablestore, developed in-house by Alibaba. An IM system normally has a separate auto-increment ID generation service, but Tablestore introduces auto-increment primary key columns, which moves ID generation into the DB layer and supports user-level increment (traditional databases such as MySQL only support table-level auto-increment, that is, global auto-increment). For details, refer to “How to optimize a highly concurrent IM system architecture”.

PS: The DingTalk team has disclosed very little about its technology. Here is one more article for those interested: “Nail – based on IM technology of the new generation of enterprise OA platform technical challenges (video +PPT)”.

7.3 Twitter

What? Isn’t Twitter a Feeds system? Isn’t this article about IM?

Yes, Twitter is a Feeds system, but there are a lot of design similarities between Feeds and IM systems, so looking at Feeds will help you design your IM system. Besides, it doesn’t hurt to explore the Feeds system to broaden your technical horizons.

Twitter’s auto-increment ID design is probably familiar to everyone: it is the well-known Snowflake, so its IDs are globally incrementing.

From the talk “How We Learned to Stop Worrying and Love Fan-In at Twitter”, we can see that Twitter started out with the write diffusion model: the Fanout Service fans writes out to the Timelines Cache (which uses Redis), the Timeline Service reads the Timeline data, and API Services return it to the user.

However, because write diffusion is too expensive for big-V accounts (verified users with huge numbers of followers), Twitter later switched to a combination of write diffusion and read diffusion.

As shown in the figure below:

Users with few followers still use the write diffusion model when they tweet. The Timeline Mixer service then merges the user’s timeline, the tweets written by big-V accounts, system recommendations, and other content, and API Services finally return the result to the user.

7.4 58 Home

58 Home implements a general-purpose real-time messaging platform:

▲ Picture quoted from “58 Home Real-time Message System architecture design and technology selection experience summary”

As the figure shows, msg-server stores the mapping between applications and MQ topics and pushes messages to the corresponding MQ queues according to this configuration, where the specific applications consume them. Adding a new application therefore requires only a configuration change.

To ensure reliable message delivery, an acknowledgement mechanism is also introduced: the message platform first persists the message to the database, and after the receiver gets the message, an application-layer ACK deletes it. The acknowledgement mechanism works best when only one terminal can be logged in at a time. If multiple terminals can be logged in simultaneously, it is more troublesome, because the message can be deleted only after all terminals have acknowledged receipt.
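The multi-terminal complication can be sketched as a pending store that tracks which terminals still owe an ACK for each message. PendingStore and its methods are hypothetical names, and the dict stands in for the database; this is not 58 Home’s actual implementation.

```python
class PendingStore:
    """Keeps a message until every logged-in terminal has acknowledged it."""

    def __init__(self):
        self.pending = {}  # msg_id -> set of terminal ids that still owe an ACK

    def persist(self, msg_id, terminals):
        # Persist first, push afterwards: the message survives a failed push.
        self.pending[msg_id] = set(terminals)

    def ack(self, msg_id, terminal):
        waiting = self.pending.get(msg_id)
        if waiting is None:
            return  # unknown or already deleted message
        waiting.discard(terminal)
        if not waiting:
            del self.pending[msg_id]  # all terminals confirmed: safe to delete

    def unacked(self, msg_id):
        return set(self.pending.get(msg_id, ()))
```

With a single terminal the set has one element and the first ACK deletes the message; with several terminals the message stays in storage until the slowest one confirms, which is the extra bookkeeping mentioned above.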

PS: Ren Taoshu, director of the 58 Home platform department, has also shared the article “58 home real-time messaging system protocol design and other technical practices to share”, which is worth reading alongside this one.

By now, you have probably realized that designing an IM system is challenging. We will move on to the issues that need to be considered when designing one.

7.5 Other Industry solutions

The IM network site also collects a large number of other articles on industry IM systems and IM system designs. For reasons of length they are not discussed one by one here; the following list is for selective reading if you are interested.

  • “A set of mobile IM architecture design practice sharing for massive online users (with detailed pictures and pictures)”
  • “A set of original distributed instant messaging (IM) system theory architecture scheme”
  • “From Zero to Excellence: The evolution of the technical architecture of JD customer service instant messaging system”
  • “Architecture choice for mogujie IM / IM server development”
  • “Discussion on synchronization and storage scheme of chat message in modern IM system”
  • “WhatsApp Technology Practice sharing: technology myth created by 32 people engineering team”
  • “Summary of technical challenges and practices behind 100 billion visits to wechat Moments”
  • “Taking microblog application scenarios as an example, the architecture design steps of massive social systems are summarized”
  • “Bullet Behind the bright SMS: Chief architect of netease Yunxin shares the technology practice of 100-million-level IM platform”
  • “A set of highly available, scalable, high concurrency IM group chat, single chat architecture design practices”
  • “From guerrilla to Regular Army (I): The evolution of the IM system architecture of hornet tourism network”
  • “From guerrilla to Regular Army (III): Technology practice of distributed IM system based on Go for hornet travel network”
  • “Data architecture design of Guazi IM intelligent customer service system (organized from the on-site speech, with supporting PPT)”
  • “Alibaba Technology sharing: E-commerce IM messaging platform, technology practice in group chat and live broadcast scenarios”
  • “A set of 100 million users of IM architecture technology dry goods (part I): overall architecture, service separation, etc”

8. Technical pain points that IM needs to address

8.1 How to ensure real-time message delivery

PS: If you don’t know what real-time means for IM messages, be sure to read the article “Getting Started with Zero-based IM Development ii: What is Real-time for IM Systems?”.

In terms of communication protocol, we have the following choices:

  • 1) Use TCP sockets with a self-designed protocol: 58 Home, etc.;
  • 2) Use UDP sockets: QQ, etc. (see “Why QQ uses UDP protocol instead of TCP protocol?”);
  • 3) Use HTTP long polling: the WeChat web version, etc.

With any of these, messages can be delivered in real time; what may affect timeliness is the way we process the messages.

For example, suppose we use MQ to process a message that must be pushed to a group of 10,000 people, and pushing to one person takes 2 ms. Pushing to 10,000 people serially then takes 20 s, and all subsequent messages are blocked for 20 s. If the push has to finish within 10 ms, the required push concurrency is: number of people 10000 / (total push duration 10 ms / per-person push duration 2 ms) = 2000.
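The arithmetic above can be written as a small helper for trying out other group sizes and deadlines. The function names are hypothetical, and the model deliberately ignores queueing, network jitter, and batching.

```python
import math


def serial_push_seconds(recipients, per_push_ms):
    """Time to push to everyone one after another, in seconds."""
    return recipients * per_push_ms / 1000


def required_concurrency(recipients, per_push_ms, deadline_ms):
    """Parallel pushers needed so that all pushes finish within the deadline."""
    return math.ceil(recipients * per_push_ms / deadline_ms)
```

For the figures in the text, serial_push_seconds(10000, 2) gives 20.0 and required_concurrency(10000, 2, 10) gives 2000.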

Therefore, when choosing a specific implementation, we must evaluate the throughput of the system, and every link in the chain should be evaluated and load-tested. Only when the throughput of every link is well understood can real-time message push be guaranteed.

Group chat messages and single chat messages are handled differently with respect to real-time delivery. If you are interested, read more here:

  • “Implementation of IM message delivery guarantee Mechanism (1): Ensure reliable delivery of online real-time messages”
  • “How to ensure the efficiency and real-time performance of the push of large-scale group messages in mobile IM?”

8.2 How to ensure message ordering

In an IM implementation, messages can get out of order in the following situations. (Hint: if you don’t know what IM message ordering is, be sure to read “The Introduction to Zero-based IM Development iv: What is IM System Message Timing Consistency?”.)

8.2.1) Messages sent over HTTP instead of a long connection may arrive out of order:

Because the back end is usually a cluster, HTTP requests may land on different servers. Due to network delay or differences in server processing speed, a message sent later may finish processing first, which leaves the messages out of order.

Solutions:

  • 1) The front end sends messages one at a time and sends the next one only after the previous one has completed. This degrades the user experience and is generally not recommended;
  • 2) The front end generates a sequential ID for each message, and receivers sort by that ID. This makes front-end processing somewhat more cumbersome, and during a chat a message may be inserted into the middle of the receiver’s message list, which looks odd and may cause the user to miss it. This can be mitigated by having the receiver append each incoming message to the end of the list and re-sort only when the user switches windows.

8.2.2) To optimize the experience, some IM systems use an asynchronous send-confirmation mechanism (for example, QQ):

That is, as soon as the message reaches the server and the server has put it into MQ, the send is considered successful. If the send later fails, for example because of a permission problem, the back end pushes a separate notification to the sender.

In this case, an appropriate sharding strategy has to be chosen for MQ:

  • 1) Sharding by to_user_id: with this strategy, multi-terminal synchronization among the sender’s own terminals may get out of order, because different queues may be processed at different speeds. For example, the sender sends M1 and then M2, but the server may process M2 before M1. The sender’s other terminals then receive M2 first and M1 second, and their session lists become confused;
  • 2) Sharding by conversation_id: this strategy also causes multi-terminal synchronization to get out of order;
  • 3) Sharding by from_user_id: in this case, this strategy is a good choice.

In addition, to optimize performance, messages may be put into another MQ before they are pushed; for that queue, sharding by to_user_id is a good choice.
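Why the shard key matters can be sketched with a simple hash-based router. The functions shard_for and route, the message dict layout, and the use of CRC32 as the hash are assumptions for this illustration; real MQ products have their own partitioners.

```python
import zlib


def shard_for(key, num_queues):
    """Stable mapping from a shard key to a queue index."""
    return zlib.crc32(str(key).encode("utf-8")) % num_queues


def route(messages, shard_key, num_queues):
    """Distribute message IDs over queues, preserving arrival order per queue."""
    queues = [[] for _ in range(num_queues)]
    for msg in messages:
        queues[shard_for(msg[shard_key], num_queues)].append(msg["id"])
    return queues


messages = [
    {"id": "M1", "from_user_id": "alice", "to_user_id": "bob"},
    {"id": "M2", "from_user_id": "alice", "to_user_id": "carol"},
]
by_sender = route(messages, "from_user_id", 8)
by_receiver = route(messages, "to_user_id", 8)
```

In by_sender, M1 and M2 always share one queue and are consumed in sending order. In by_receiver, they may land in different queues that are consumed at different speeds, which is how the sender’s other terminals can end up seeing M2 before M1.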

PS: Many other situations can also cause IM messages to get out of order; I won’t go into them here. Further reading:

  • “How do I ensure timeliness and consistency of IM real-time messages?”
  • “A low-cost method to ensure timing of IM messages”
  • “A set of 100 million user IM architecture technology dry goods (part II): reliability, order, weak network optimization, etc”

8.3 How to handle user online status

Many IM systems need to display a user’s status: online, busy, and so on.

To store users’ online status, the following can be used:

  • 1) Redis;
  • 2) Distributed consistent hashing.

Storing user online status in Redis:

Looking at the picture above, some may wonder: why does Redis need to be updated on every heartbeat?

With a long-lived TCP connection, isn’t it unnecessary to update on every heartbeat?

Indeed, under normal circumstances the server only needs to update Redis when a connection is established or closed. However, the server may fail, or the network between the server and Redis may have problems, so purely event-based updates can go wrong and leave the user status incorrect. Therefore, if the online status needs to be accurate, it is best to refresh it with every heartbeat.
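The heartbeat-refresh idea can be sketched with an in-memory store whose entries expire, which is how a key with a TTL would behave in Redis. PresenceStore, its methods, and the injectable clock are hypothetical and used only to keep the example self-contained; no Redis client is involved.

```python
import time


class PresenceStore:
    """Online status as expiring entries, refreshed by every heartbeat."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds   # should exceed the heartbeat interval
        self.clock = clock
        self.expires_at = {}     # user_id -> expiry timestamp

    def heartbeat(self, user_id):
        # Each heartbeat pushes the expiry forward.
        self.expires_at[user_id] = self.clock() + self.ttl

    def disconnect(self, user_id):
        # Event-based update: used when the close event is actually observed.
        self.expires_at.pop(user_id, None)

    def is_online(self, user_id):
        return self.expires_at.get(user_id, float("-inf")) > self.clock()
```

If the connection server crashes and never reports the disconnect, the entry simply expires after the TTL, so the status corrects itself without relying on the close event.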

Because a single Redis instance keeps its data on one machine, Redis Cluster or Codis can be used to improve reliability and performance.

Storing user online status with distributed consistent hashing:

When using distributed consistent hashing, the user status must be migrated when the Status Server Cluster is scaled out or in; otherwise, the user status may be inconsistent right after the change. Virtual nodes are also needed to avoid data skew.
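A minimal consistent hash ring with virtual nodes might look like the sketch below. The class name ConsistentHashRing, the choice of MD5 as the hash, and the default of 100 virtual nodes per server are assumptions for illustration.

```python
import bisect
import hashlib


class ConsistentHashRing:
    """Consistent hashing with virtual nodes for a status server cluster."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (hash, node)
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(key):
        digest = hashlib.md5(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add_node(self, node):
        # Many virtual nodes per server spread its share around the ring.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove_node(self, node):
        self._ring = [entry for entry in self._ring if entry[1] != node]

    def node_for(self, key):
        if not self._ring:
            raise LookupError("ring is empty")
        # First virtual node clockwise from the key's position.
        idx = bisect.bisect_left(self._ring, (self._hash(key),))
        return self._ring[idx % len(self._ring)][1]
```

When a server is added, only the users whose position now falls before one of its virtual nodes move to it; those are exactly the users whose status has to be migrated during scaling.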

PS: Updating user status on the client side is also a very challenging problem. For details, read “Should I use ‘push’ or ‘pull’ in IM chat and group chat?”.

8.4 How to Perform Multi-Terminal Synchronization

8.4.1) Read diffusion:

As mentioned earlier, for read diffusion, message synchronization mainly uses push mode. The message IDs within a session increase continuously, so if the front end receives a pushed message and finds a gap in the message IDs, it asks the back end for the missing messages.

However, the last message of a session can still be lost this way, because no later message arrives to reveal the gap.

To improve reliability, the last message ID of each session can be included in the historical session list. When the front end receives a new message, it first pulls the latest session list and compares each session’s last message ID with the last message ID it holds locally for that session. If they differ, a message may have been lost, and the front end pulls that session’s message list again; if they are the same, no further action is needed.
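The comparison step can be sketched as a single function on the front end. The name sessions_to_repull and the dict shapes (session ID mapped to last message ID) are assumptions for illustration, and the sketch assumes session-level incrementing message IDs.

```python
def sessions_to_repull(server_session_list, local_last_ids):
    """Return the sessions whose message list has to be pulled again.

    server_session_list: session_id -> last message ID known to the server
    local_last_ids:      session_id -> last message ID held by this client
    """
    return [
        session_id
        for session_id, server_last_id in server_session_list.items()
        # A session missing locally counts as 0, so new sessions are pulled too.
        if local_last_ids.get(session_id, 0) < server_last_id
    ]
```

For example, with a server list of {"s1": 10, "s2": 7, "s3": 1} and local state {"s1": 10, "s2": 6}, the client re-pulls s2 (the last message was lost) and s3 (a session it has not seen yet).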

The performance bottleneck of this approach is pulling the historical session list, because every new message triggers a pull from the back end. At WeChat’s scale, messages alone may reach 200,000 QPS; if the historical session list is stored in a traditional database such as MySQL, it will certainly not hold up.

Therefore, it is best to keep the historical session list in a Redis cluster with AOF enabled (with RDB alone, data may be lost). One can only sigh that performance and simplicity cannot both be had.

8.4.2) Write diffusion:

For write diffusion, multi-terminal synchronization is simpler. The front end only needs to record its last synchronization point and send it with each sync request. The server returns all data after that point, and the front end then updates its synchronization point.

PS: Multi-terminal synchronization is another notoriously tricky pain point in IM. If you are interested, read “talk about mobile IM multi-point login and message roaming principle”.

8.5 What Should I Do If No Reading Is Displayed

In AN IM system, the processing of unread readings is very important.

The unread is generally divided into session unread and total unread. If not handled properly, the session unread and total unread may be inconsistent, seriously degrading the user experience.

8.5.1) Read diffusion:

For read diffusion, both the session unread count and the total unread count can be stored on the back end, but the back end needs to ensure that the two are updated atomically and consistently.

This can be achieved in the following two ways:

  • 1) Use the Redis MULTI transaction feature; if the transaction fails, it can be retried. Note that transactions are not supported if you are using a Codis cluster;
  • 2) Use an embedded Lua script. To do this, make sure that the session unread count and the total unread count are on the same Redis node (with Codis, a hashtag can be used). This approach scatters the implementation logic and increases maintenance costs.
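The first option can be sketched as below, assuming a redis-py style client whose `pipeline(transaction=True)` wraps the queued commands in MULTI/EXEC. The key names and the function name are hypothetical; the `{...}` part of each key is a hash tag intended to keep both keys on the same node in clustered setups.

```python
def incr_unread(client, user_id, session_id, delta=1):
    """Bump the session unread and the total unread in one MULTI/EXEC.

    `client` is assumed to behave like redis.Redis from redis-py.
    Returns (session_unread, total_unread) after the update.
    """
    # Hypothetical key layout; the hash tag {user_id} co-locates both keys.
    sessions_key = f"unread:{{{user_id}}}:sessions"  # hash: session_id -> count
    total_key = f"unread:{{{user_id}}}:total"        # plain integer counter

    pipe = client.pipeline(transaction=True)  # commands are queued until execute()
    pipe.hincrby(sessions_key, session_id, delta)
    pipe.incrby(total_key, delta)
    session_unread, total_unread = pipe.execute()
    return session_unread, total_unread
```

Clearing a session's unread count needs to read the session value before adjusting the total, which a plain MULTI block cannot do in one step; that case is where the Lua script option becomes attractive.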

8.5.2) Write diffusion:

For write diffusion, the server often weakens the concept of a session, that is, the server does not store a list of historical sessions.

Unread counts can be computed by the front end. Mark-read and mark-unread operations only need to be recorded as events in the mailbox, and each terminal derives the session unread count by replaying those events.

This approach can leave unread counts inconsistent across terminals; at least, this happens with WeChat.

If write diffusion also stores unread counts in a historical session list, the user Timeline service becomes tightly coupled to the session service. Guaranteeing atomicity and consistency then requires distributed transactions, which significantly reduce system performance.
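Event replay on the front end could look roughly like the sketch below. The event names (`message`, `mark_read`, `mark_unread`) and the rule that marking a session unread shows at least one unread are assumptions for illustration, not a description of any specific product.

```python
from collections import defaultdict

def replay_unread(events):
    """Derive unread counts by replaying mailbox events in order.

    events: iterable of (kind, session_id), where kind is one of
    "message", "mark_read", "mark_unread" (hypothetical event names).
    Returns ({session_id: unread}, total_unread).
    """
    unread = defaultdict(int)
    for kind, session_id in events:
        if kind == "message":
            unread[session_id] += 1
        elif kind == "mark_read":
            unread[session_id] = 0
        elif kind == "mark_unread":
            # Show at least one unread without discarding a larger real count.
            unread[session_id] = max(unread[session_id], 1)
    per_session = {s: n for s, n in unread.items() if n > 0}
    return per_session, sum(per_session.values())
```

Because the total is always derived from the per-session values, the two cannot drift apart on a single terminal, although two terminals that have replayed different prefixes of the mailbox may still disagree.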

8.6 How Do I Store Historical Messages

Read diffusion: only one copy needs to be stored, sharded by session ID.

Write diffusion: two copies need to be stored. One is the message list of the user Timeline and the other is the message list of the session Timeline. The user Timeline can be sharded by user ID, and the session Timeline by session ID.

PS: If you are not familiar with the concept of Timeline, see the article “How to synchronize and store Chat Messages in Modern IM Systems”.
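To make the two sharding keys concrete, here is a small sketch of shard routing under write diffusion. The shard count, the key prefixes, and the use of CRC32 modulo the shard count are illustrative assumptions; production systems often use consistent hashing or a lookup table instead.

```python
import zlib

NUM_SHARDS = 16  # illustrative value

def shard_for(key, num_shards=NUM_SHARDS):
    """Map a string key to a shard index."""
    # crc32 is stable across processes, unlike Python's built-in hash() for str.
    return zlib.crc32(key.encode("utf-8")) % num_shards

def write_targets(session_id, member_ids):
    """Shards touched when one message is stored under write diffusion."""
    return {
        # One copy in the session Timeline, sharded by session ID.
        "session_timeline": shard_for(f"session:{session_id}"),
        # One copy per recipient in the user Timeline, sharded by user ID.
        "user_timelines": {uid: shard_for(f"user:{uid}") for uid in member_ids},
    }
```

Under read diffusion only the `session_timeline` entry would be written.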

8.7 Separation of Hot and Cold Data

For IM, stored historical messages have a strong time-series character: the older a message is, the less likely it is to be accessed and the less valuable it is.

If we need to store historical messages for years or even forever (which is common in e-commerce IM), then it is necessary to separate hot and cold historical messages.

Hot and cold data are generally separated using a Hot-Warm-Cold (HWC) architecture.

Newly sent messages can be stored in the Hot storage system (Redis) or the Warm storage system. The Store Scheduler then migrates cold data to the Cold storage system according to certain rules.

To obtain messages, you need to access the Hot, Warm, and Cold storage systems in sequence. The Store Service consolidates data and returns it to the IM Service.
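The sequential lookup can be sketched as follows. Each tier is modeled as a callable that returns messages older than a given ID, newest first; the function names and this reader interface are assumptions for illustration, and `make_tier` is only an in-memory stand-in for a real storage client.

```python
def fetch_history(session_id, before_id, limit, tiers):
    """Return up to `limit` messages with ID < before_id, newest first.

    tiers: readers ordered hot, warm, cold. Each reader is called as
    reader(session_id, before_id, n) and returns (msg_id, body) pairs,
    newest first, with msg_id < before_id.
    """
    result = []
    for read in tiers:
        if len(result) >= limit:
            break  # colder tiers are only consulted when still short
        # Continue below the oldest message found so far, which also skips
        # messages that overlap between tiers.
        cursor = result[-1][0] if result else before_id
        result.extend(read(session_id, cursor, limit - len(result)))
    return result

def make_tier(store):
    """Build an in-memory reader from {session_id: {msg_id: body}}."""
    def read(session_id, before_id, n):
        msgs = store.get(session_id, {})
        ids = sorted((i for i in msgs if i < before_id), reverse=True)
        return [(i, msgs[i]) for i in ids[:n]]
    return read
```

Recent history is served entirely from the hot tier in this sketch, so the slower tiers are touched only when a user scrolls far back.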

The WeChat team's article “Wechat background Based on the time sequence of massive data cold and hot classification architecture design practice” may offer some inspiration.

8.8 How to Design the Access Layer

For distributed IM, the access layer must be considered.

The following methods can be used to realize load balancing at the access layer:

  • 1) Hardware load balancing: for example F5 or A10. Hardware load balancers offer strong performance and high stability, but they are very expensive and are not recommended unless the company is well funded;
  • 2) DNS load balancing: it is simple to set up, but switching or scaling takes a long time to take effect. In addition, the number of IP addresses DNS can return is limited and the supported load balancing policies are relatively simple;
  • 3) DNS + layer 4 load balancing + layer 7 load balancing: for example, DNS + DPVS + Nginx or DNS + LVS + Nginx;
  • 4) DNS + layer 4 load balancing: layer 4 load balancers are generally stable and rarely change, so this is more suitable for persistent connections.

For point 3), one might wonder why layer 4 load balancing is added.

This is because layer 7 load balancers are CPU intensive and often need to be scaled out or in. A large site may need many layer 7 load balancers but only a few layer 4 load balancers. This architecture is therefore useful for large applications with short-lived connections such as HTTP.

Of course, if the traffic is not large, DNS + layer 7 load balancing is enough. For persistent connections, however, adding Nginx as a layer 7 load balancer is not a good fit: the Nginx configuration changes often and has to be reloaded, and persistent TCP connections can be dropped on reload, causing a large number of disconnections.

For a persistent-connection access layer, a scheduling service can be introduced if a more flexible load balancing strategy or gray release is needed.

As shown in the figure below:

The Access Schedule Service assigns Access Services to clients based on various policies.

For example:

  • 1) According to the gray release policy;
  • 2) According to the nearest principle;
  • 3) According to the minimum number of connections.
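One possible way to combine these policies is sketched below: filter by gray release membership first, then prefer nodes in the user's region, then pick the node with the fewest connections. The `AccessNode` fields, the function name, and the order in which the policies are chained are all assumptions for illustration; a real scheduler might apply only one policy or weight them differently.

```python
from dataclasses import dataclass

@dataclass
class AccessNode:
    addr: str
    region: str
    connections: int
    gray: bool = False  # True if the node runs the gray release build

def pick_access_node(nodes, user_region, user_in_gray):
    """Choose an access node for a client; `nodes` must not be empty."""
    # 1) Gray policy: gray users go to gray nodes, others to stable nodes.
    #    Fall back to all nodes if no node matches.
    pool = [n for n in nodes if n.gray == user_in_gray] or list(nodes)
    # 2) Nearest principle: prefer nodes in the user's region when available.
    pool = [n for n in pool if n.region == user_region] or pool
    # 3) Minimum number of connections among the remaining candidates.
    return min(pool, key=lambda n: n.connections)
```

The client would call the scheduling service before connecting and then open its persistent connection directly to the returned address.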

9. Closing Remarks

After reading the above, you should have a deep appreciation of how difficult it is to build a stable and reliable distributed IM system for a large user base. There is a long way to go.

In the constant pursuit of better experience, higher performance, greater load capacity, and lower cost, the road of IM architecture optimization has no end. So, to spare yourself some anxiety, take your time with the code and keep a cool head ~~

PS: This article is written mainly from the perspective of IM architecture design and may be hard for IM beginners to follow. Beginners are advised to start with this article: “a beginner is enough: from zero development of mobile IM”.

10. References

[1] 58 A study on the application of real-time messaging systems in China [J]

[2] A set of mobile IM architecture design practice sharing for massive online users (with detailed text and text)

[3] A study on the synchronization and storage of chat messages in modern IM systems

[4] A set of 100 million users IM architecture technology dry goods (Part I) : overall architecture, service separation, etc

[5] IM Message ID Technology (2) : Wechat Mass IM Chat Message serial number Generation Practice (Disaster Solution)

[6] Rapid Fission: Witness the evolution of wechat’s Powerful background Architecture from 0 to 1 (2)

This article has also been published on the “Im Technosphere” official account. The synchronized publication link is: http://www.52im.net/thread-3472-1-1.html