Instant messaging (IM) technology basics

Advanced topics in instant messaging (IM) technology

Topics

Preparation (protocol selection)
- Choosing the network transport protocol and the data communication protocol
XXX project architecture
- Architecture pros and cons
- The road to architecture improvement
Key IM technical points and strategies
- How to ensure that messages are not lost, reordered, or duplicated
- Heartbeat strategy
- Reconnection strategy
Typical IM service scenarios
- User A sends a message to user B
- User A sends a message to group C
Storage structure analysis

Preparation (protocol selection)

Which network transport protocol (TCP, UDP, or HTTP) should be chosen?

UDP offers better real-time performance, but making transmission secure and reliable and handling message exchange between different clients on top of it is hard, and too complicated to implement. Most IM architectures therefore do not use UDP.
Then why is HTTP needed as well?
- The core TCP long connection is used to send and receive messages in real time. Other resource requests do not occupy this connection, which protects real-time delivery
- HTTP can be used to implement state protocols (developed in PHP)
  - Circle of friends
  - User personal information (friend information, account, search, etc.)
  - Offline messages use pull mode, so that load on the TCP channel does not slow down instant message delivery
  - And so on…
- When IM carries image, voice, or large doodle messages, HTTP makes resumable transfer and chunked upload easy to handle.
TCP: maintains the long connection to keep message delivery real-time.
- Purpose: to send and receive messages promptly

Which data communication protocol should be chosen?

The principles for selecting an IM protocol are: easy to extend, able to cover a variety of business logic, and economical with traffic. Saving traffic is especially important for mobile IM.
- XMPP: an open protocol that is extensible and has implementations in many languages on every end (including the server), so it is easy for developers to adopt. It has many drawbacks, though: XML is weakly expressive and carries a lot of redundant information, traffic is high, and there are many pitfalls in practice.
- MQTT: the protocol is simple and traffic is low. However, it is not designed for IM and is mostly used for push. Groups, friends, and similar features must be implemented separately (the company currently uses MQTT to implement a general IM framework).
- SIP: a text protocol, mainly used for VoIP-related modules. SIP signaling control is complex.
- Private protocol: implement your own protocol. Most mainstream IM apps use private protocols. A well-designed private protocol generally has these advantages: high efficiency, low traffic (a binary protocol is generally used), high security, and difficulty of cracking. The XXX project basically uses a custom private protocol (modeled on Mogujie's open-source TeamTalk), and the later general IM architecture uses MQTT.
Considerations for protocol design:
- Network data size (bandwidth and transmission efficiency): for a single user the amount of data transmitted is small, but the server must sustain highly concurrent transfers from many users. Bandwidth therefore matters: avoid redundant data so that each message uses less bandwidth, fewer resources, and less network I/O, and is transmitted more efficiently.
- Network data security (protecting sensitive data): some business data is sensitive, so encrypting part of the transmitted data must be considered (the XXX project currently provides a C++ encryption library for clients to use).
- Encoding complexity: the complexity and efficiency of serialization and deserialization, and the extensibility of the data structures.
- Protocol generality (a common specification): data types must be cross-platform and data formats must be universal.
Common serialization protocols
- Open-source protocols that provide serialization and deserialization libraries: PB (Protocol Buffers) and Thrift. They are easy to extend and convenient to serialize and deserialize (the XXX project currently uses PB).
- Text protocols: XML and JSON. Serialization and deserialization are easy, but the encoded data is large (HTTP interfaces generally use JSON).

XXX project system architecture

Early architecture

Improved architecture

Strengths and weaknesses of the architecture

Advantages

Both TCP and HTTP are supported, and unrelated services are isolated from each other
- PHP server
- Router server
- User center
- Access server
- Oracle server
Services can be scaled out horizontally, which is convenient and transparent to users
The cache and DB layers are encapsulated, so business callers simply call the interface
All services are stateless except the Access server, which is stateful
Services communicate with each other across machines via RPC
Oracle is modular, somewhat like MVC, with decoupled code and decoupled functionality

Disadvantages

Oracle is too large; some of its business logic can be split off
- Disadvantages
  - The service contains too much business logic, which is inconvenient when several developers work on it and may cause code conflicts
  - If one small feature fails, the entire service may become unavailable
- Improvement
  - Oracle is coupled with the APNs server, so APNs handling can be extracted separately (the XXX project has started to integrate with the general push system, which amounts to extracting APNs).
Push Server has no business logic and merely forwards requests between Access and Oracle
- Disadvantages
  - A relatively thin service has to be maintained separately, which increases operation and maintenance costs
- Improvement
  - Merge Push Server into Access to remove one intermediate RPC hop. This reduces operation and maintenance costs and improves efficiency (the new architecture of the XXX project has merged Push Server into Access).
Access Server is closely tied to users. Besides maintaining long connections, it also carries some business logic
- Disadvantages
  - Because it maintains long connections, updating it is bound to affect the connection status of online users
  - Its occasional business logic reduces the stability of the long connections
- Improvement
  - Extract a Connd server from the Access Server to maintain the long connections:
    - It only maintains long connections and sends and receives packets, and is decoupled from all business logic (the XXX project is currently making this change, which is not yet online).

Key IM technical points

Technical point 1: how to ensure that messages are reachable (not lost), unique (not duplicated), and ordered (not out of order)

The simplest case: preserving order

Why might messages arrive out of order?
- For online messages, there is normally no problem
  - But what if the network fails and the message cannot be delivered?
    - The server either resends it or moves it to offline storage (the XXX project moves it to offline storage immediately)
- For offline messages, there can be many
  - A pull usually fetches all offline messages at once
    - When multiple messages are received, they must be processed in order
How is ordering guaranteed?
- When a message reaches the server, a globally unique MSGID is generated. This MSGID must increase monotonically (MSGID generation is synchronized so that IDs stay unique under concurrency).
- Each message carries its generation time, accurate to the millisecond
- When multiple messages are pulled, sort them by MSGID
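A minimal single-process sketch of the two ideas above: a synchronized, monotonically increasing MSGID generator and sorting a pulled batch by MSGID. The class and function names are hypothetical, and a real deployment with several server processes would need a shared ID source, which this sketch does not cover.

```python
import threading


class MsgIdGenerator:
    """Hands out increasing IDs; the lock keeps them unique under concurrency."""

    def __init__(self, start: int = 1) -> None:
        self._lock = threading.Lock()
        self._next = start

    def next_id(self) -> int:
        with self._lock:
            value = self._next
            self._next += 1
            return value


def sort_for_delivery(messages: list) -> list:
    """Order a pulled batch by MSGID before handing it to the UI."""
    return sorted(messages, key=lambda message: message["msg_id"])


if __name__ == "__main__":
    generator = MsgIdGenerator()
    batch = [{"msg_id": generator.next_id(), "body": text} for text in ("a", "b", "c")]
    batch.reverse()  # pretend the batch arrived out of order
    print([message["body"] for message in sort_for_delivery(batch)])
```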

Ensuring uniqueness (no duplication)

Why might messages be duplicated?
- Because mobile networks are unstable, a given message may fail to be sent, or the ACK for it may never arrive after it is sent.
  - A retransmission mechanism is therefore required, possibly on both the client and the server.
  - Once there is retransmission, duplicate messages can be received.
How is this solved? Deduplication is best handled on both the client and the server
- Add an isResend field to the message's meta structure. The client sets this field when it resends a message, to mark it as a retransmission for the server's later check
- For each user, the server caches a batch of the most recent MSGIDs (called localMsgIds), for example 50
- On receiving a message, the server checks isResend and whether the MSGID is in the localMsgIds list. If the message is a duplicate, the server performs no further processing.
- isResend alone cannot give an accurate answer, because the client may indeed have resent a message that the server never received the first time
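The server-side check described above can be sketched as follows. It assumes each message carries an ID that stays the same across retransmissions; the class name `ResendFilter` and its method are hypothetical, and the limit of 50 is the example figure from the text.

```python
from collections import OrderedDict, defaultdict

RECENT_LIMIT = 50  # example size from the text


class ResendFilter:
    """Keeps the most recent message IDs per user to detect duplicates."""

    def __init__(self, limit: int = RECENT_LIMIT) -> None:
        self._limit = limit
        # user_id -> ordered set of recently seen message IDs (oldest first)
        self._recent = defaultdict(OrderedDict)

    def should_process(self, user_id: str, msg_id: int, is_resend: bool) -> bool:
        recent = self._recent[user_id]
        # A resend is a duplicate only if the first copy actually arrived.
        if is_resend and msg_id in recent:
            return False
        recent[msg_id] = None
        recent.move_to_end(msg_id)
        while len(recent) > self._limit:
            recent.popitem(last=False)  # evict the oldest ID
        return True


if __name__ == "__main__":
    resend_filter = ResendFilter()
    print(resend_filter.should_process("A", 1001, is_resend=False))  # first copy
    print(resend_filter.should_process("A", 1001, is_resend=True))   # duplicate
    print(resend_filter.should_process("A", 1002, is_resend=True))   # first copy was lost
```

Checking both the flag and the cache is what handles the last bullet: a resend whose first copy never arrived is not in the cache, so it is processed normally.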

Guaranteeing reachability (no loss and no duplication)

In the simplest case, the server requires an ACK from the receiver for every message it sends, to ensure reachability
- However, the ACK itself can be lost on a weak network.
The client may not receive the data returned by the server, or the server may not receive the client's response to that data.
- Therefore, a sound acknowledgement mechanism is needed to confirm that a message has been received, once and only once.
Consider what happens when one account logs in from different terminals.
- Messages must reach the currently logged-in terminal, and data that has already been pulled must not be resent or pulled again.

Technical point 2: MSGID mechanism

Here are two solutions for reference (they are the same in essence but differ in implementation)

Sequence-number MSGID mechanism and MSGID confirmation mechanism (option 1):

Each message from each user must be assigned a unique MSGID
The server stores the MSGID list for each user
The client stores the largest MSGID that it has received

Advantages:

Based on the difference between the server's sequence and the phone's sequence, messages that the phone has not yet received can easily be delivered incrementally
On a weak network the packet loss rate is relatively high, and the server's response often fails to reach the phone. Because the phone updates its local sequence only after it has actually received the messages, even if the server's response is lost and the phone pulls again after a timeout, the undelivered messages are still received correctly.
Because the sequence stored on the phone is the largest sequence confirmed as received, each pull from the phone can also be treated as an acknowledgement of the previous batch. If an account logs in on several phones in turn, the server only needs to store the sequence that the phones have confirmed to ensure that confirmed messages are not delivered again: after the account moves from one phone to another, the other phone does not receive messages that have already been received.

How a user obtains messages when logging in on different terminals

If phone A pulls messages with Seq_cli = 100 while the server's Seq_svr = 150, phone A receives the messages with sequences 101 to 150 and sets its local Seq_cli to 150
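The Seq_cli / Seq_svr example can be sketched in memory as below. `SeqMailbox` and `Client` are hypothetical names, and discarding acknowledged messages inside `pull` is one assumed way of treating the pull as a confirmation of the previous batch.

```python
class SeqMailbox:
    """Server-side per-user message store, indexed by sequence number."""

    def __init__(self) -> None:
        self._messages = {}  # seq -> payload
        self.seq_svr = 0

    def append(self, payload: str) -> int:
        self.seq_svr += 1
        self._messages[self.seq_svr] = payload
        return self.seq_svr

    def pull(self, seq_cli: int) -> list:
        # The client's sequence confirms everything up to seq_cli,
        # so those messages can be discarded now.
        for seq in [s for s in self._messages if s <= seq_cli]:
            del self._messages[seq]
        # Whatever remains is newer than seq_cli and still undelivered.
        return [(seq, self._messages[seq]) for seq in sorted(self._messages)]


class Client:
    """Stores only the largest sequence it has actually received."""

    def __init__(self, seq_cli: int = 0) -> None:
        self.seq_cli = seq_cli

    def sync(self, mailbox: SeqMailbox) -> list:
        batch = mailbox.pull(self.seq_cli)
        if batch:
            self.seq_cli = batch[-1][0]  # updated only after receipt
        return batch


if __name__ == "__main__":
    mailbox = SeqMailbox()
    for number in range(1, 151):
        mailbox.append(f"message {number}")
    phone = Client(seq_cli=100)
    received = phone.sync(mailbox)
    print(received[0][0], received[-1][0], phone.seq_cli)
```

If the server's response is lost, the phone's Seq_cli stays at 100, so the next pull returns sequences 101 to 150 again and nothing is dropped.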

Sequence-number MSGID mechanism and MSGID confirmation mechanism (option 2: the current choice of the XXX project):

Each message from each user must be assigned a unique MSGID
The server stores the MSGID list for each user
The client stores the largest MSGID that it has received
- Stored separately for single chat, group chat, and anonymous chat (keyed by the person's ID or the group's ID).

Thinking

What are the advantages and disadvantages of the two approaches?

In option 2, acknowledgement always costs one extra HTTP request, but it ensures that acknowledged data is cleared promptly.
In option 1, acknowledgement waits until the next pull, so no extra request is needed, but acknowledged data is not cleared promptly.

Technical point 3: heartbeat strategy

Purpose of the heartbeat: keep TCP long connections alive and stable. On mobile networks, is that its only purpose?

The heartbeat actually does two things
- It keeps the connection between client and server alive, and lets the server check whether the client is online
- It also keeps the mobile network's GGSN mapping alive
  - Carriers use Network Address Translation (NAT) to translate between the IP addresses of the mobile internal network and external network addresses so that devices can reach the Internet. The Gateway GPRS Support Node (GGSN) implements this NAT process. To reduce the load on the gateway's NAT mapping table, most carriers delete the table entry for a link that has carried no traffic for some time, which interrupts the link. Carriers deliberately shorten the idle-connection timeout to save channel resources, and this deliberate release can cause our connections to be dropped passively (the XXX project used to have its connections dropped by the carrier between heartbeats; the heartbeat strategy has since been improved and will continue to be improved).
  - Put simply, the NAT scheme replaces the old practice of assigning each broadband user an independent public IP with assigning each user an internal IP. The carrier then deploys NAT devices for all access users. The NAT device translates the internal IP from which a user's connection originates into a public IP, distinguished by port, and then connects to external resources.
  - The address between the phone and the GGSN is an internal address. The GGSN performs NAT/PAT to translate it into an address from the GGSN's public address pool, so the IP address your phone presents on the Internet is one from that public pool
The most common approach is to send a heartbeat every four and a half minutes, but that is not smart.
- Four and a half minutes comes from combining the NAT timeouts of the different mobile carriers
- Too short a heartbeat interval consumes traffic and power and increases server load
- Too long an interval risks disconnection, because the carrier's policy may evict the corresponding NAT table entry
Intelligent heartbeat strategy
- Keep the mobile network's GGSN (Gateway GPRS Support Node) mapping alive
  - Most mobile carriers evict the NAT table entry for a link that has had no data traffic for some time, which interrupts the link. NAT timeout is an important factor in the lifetime of TCP connections (especially in China). Having the client measure the NAT timeout automatically and adjust the heartbeat interval dynamically is therefore an important optimization.
- For reference, WeChat's adaptive heartbeat algorithm:
  - To keep message delivery timely, a fixed heartbeat is used while the app is active in the foreground.
  - When the app enters the background (or the screen turns off in the foreground), the minimum heartbeat is used first to keep the long connection alive, and then the background adaptive heartbeat calculation begins. The aim is to run the calculation when the user is least active, reducing the impact of any delayed message delivery it causes.
Keep heartbeat packets minimal, with each packet under 10 bytes, and adjust the heartbeat interval according to the app's state (mainly on Android)
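As an illustration only, the sketch below shows a heartbeat packet well under 10 bytes and a simple probing scheme for the background interval. It is not WeChat's actual algorithm: the packet layout, the 60-second minimum, and the 30-second step are assumptions, and 270 seconds is just the four-and-a-half-minute figure mentioned above.

```python
import struct

HEARTBEAT_TYPE = 0x01  # hypothetical packet type code
_HEARTBEAT = struct.Struct(">BI")  # 1-byte type + 4-byte sequence = 5 bytes


def encode_heartbeat(sequence: int) -> bytes:
    """Build a 5-byte heartbeat packet."""
    return _HEARTBEAT.pack(HEARTBEAT_TYPE, sequence & 0xFFFFFFFF)


class AdaptiveHeartbeat:
    """Probes upward from the minimum interval until a heartbeat fails."""

    def __init__(self, foreground_s: int = 270, min_s: int = 60,
                 max_s: int = 270, step_s: int = 30) -> None:
        self.foreground_s = foreground_s
        self.min_s = min_s
        self.max_s = max_s
        self.step_s = step_s
        self.current_s = min_s   # background probing starts at the minimum
        self.settled = False     # True once a failure has bounded the interval

    def on_success(self) -> None:
        # Keep lengthening the interval while heartbeats still get through.
        if not self.settled:
            self.current_s = min(self.current_s + self.step_s, self.max_s)

    def on_failure(self) -> None:
        # The NAT entry probably expired: step back and stop probing.
        self.current_s = max(self.current_s - self.step_s, self.min_s)
        self.settled = True

    def interval(self, foreground: bool) -> int:
        return self.foreground_s if foreground else self.current_s


if __name__ == "__main__":
    heartbeat = AdaptiveHeartbeat()
    for _ in range(3):
        heartbeat.on_success()
    heartbeat.on_failure()
    print(len(encode_heartbeat(1)), heartbeat.interval(foreground=False))
```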

Technical point 4: disconnection and reconnection strategy

After a disconnect, the reconnection interval is chosen according to the situation. If the local network is down, there is no need to retry periodically; just monitor the network status and reconnect once it recovers. If the network changes very frequently, especially while the app is running in the background, reconnection can be rate-limited to keep messages reasonably timely while avoiding excessive power consumption.

The minimum interval between reconnections follows the sequence 4, 8, 16, … (max. 30) to avoid frequent disconnect/reconnect cycles and so reduce server load. This policy is reset when the server receives a valid packet
If the network is available but the connection fails, retries continue at intervals of 2, 2, 4, 4, 8, 8, 16, 16, … (max. 120)
To prevent an avalanche effect, when a socket failure is detected (on the server side), the client does not reconnect immediately. Instead it sleeps for a random period (or uses another strategy) before reconnecting, so that clients do not all reconnect at the same moment when the server restarts.
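The two interval sequences and the random delay can be sketched as generators. The function names are hypothetical, treating the values as seconds is an assumption, and the 3-second jitter bound is an illustrative choice rather than a figure from the text.

```python
import itertools
import random


def doubling_backoff(start: int = 4, cap: int = 30):
    """Yields 4, 8, 16, then the cap (30) from there on."""
    delay = start
    while True:
        yield min(delay, cap)
        delay = min(delay * 2, cap)


def paired_backoff(start: int = 2, cap: int = 120):
    """Yields 2, 2, 4, 4, 8, 8, 16, 16, ... up to the cap (120)."""
    delay = start
    while True:
        capped = min(delay, cap)
        yield capped
        yield capped
        delay = min(delay * 2, cap)


def with_jitter(delay: float, max_jitter: float = 3.0, rng=random) -> float:
    """Adds a random extra wait so clients do not reconnect in lockstep."""
    return delay + rng.uniform(0, max_jitter)


if __name__ == "__main__":
    print(list(itertools.islice(doubling_backoff(), 5)))
    print(list(itertools.islice(paired_backoff(), 8)))
    print(with_jitter(4))
```

Resetting the policy after a successful exchange amounts to creating a fresh generator.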

Typical IM service scenario flow

User A sends a message to user B
- A obtains a token using the account and password.
- A logs in with the token
- The server caches the user information and maintains the login state
- A packs the data and sends it to the server
- The server checks whether user A is a risky user
- The server checks the message for sensitive words (this is important)
- The server generates the MSGID
- The server checks the friend relationship (A/B)
- The server performs duplicate-send detection
- The server obtains user B's connection information and determines whether B is online
- If B is online, the message is sent directly to B and written to the cache and the DB
- If B is not online, the message is stored directly. On iOS, an APNs push is sent.
- If online, B replies with an ACK after receiving the message.
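The server-side part of this flow can be sketched as a sequence of checks in the same order. Everything here is hypothetical: the `Server` class, its in-memory sets, the `client_key` used for duplicate detection, and the returned status strings are stand-ins for real services such as risk control, friend lookup, and connection management.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    risky_users: set = field(default_factory=set)
    sensitive_words: set = field(default_factory=set)
    friends: set = field(default_factory=set)   # frozenset({a, b}) pairs
    online: set = field(default_factory=set)
    seen: set = field(default_factory=set)      # (sender, client_key) pairs
    stored: list = field(default_factory=list)  # stands in for cache + DB
    next_msg_id: int = 1

    def handle_send(self, sender: str, receiver: str,
                    client_key: str, text: str) -> str:
        if sender in self.risky_users:                        # risk check
            return "rejected: risky user"
        if any(word in text for word in self.sensitive_words):
            return "rejected: sensitive word"
        msg_id = self.next_msg_id                             # generate MSGID
        self.next_msg_id += 1
        if frozenset((sender, receiver)) not in self.friends:
            return "rejected: not friends"
        if (sender, client_key) in self.seen:                 # duplicate send
            return "dropped: duplicate"
        self.seen.add((sender, client_key))
        self.stored.append((msg_id, sender, receiver, text))
        if receiver in self.online:
            return "delivered"   # B would now reply with an ACK
        return "stored offline"  # an APNs push would follow on iOS


if __name__ == "__main__":
    server = Server(friends={frozenset(("A", "B"))}, online={"B"})
    print(server.handle_send("A", "B", "k1", "hello"))
    print(server.handle_send("A", "B", "k1", "hello"))
```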
User A sends a message to group C

Storage structure

Unread message index

The unread message index exists to guarantee message reliability and to serve as the index from which offline users fetch their list of unread messages.
The unread message index consists of two parts, both stored in Redis:
- A hash that records the unread count for each of the user's friends
- A zset per friend that holds the IDs of all unread messages from that friend
Suppose A has three friends, B, C, and D, and A is offline. B sends A 1 message, C sends A 2 messages, and D sends A 3 messages. A's unread index then looks like this:
Hash structure
- B: 1
- C: 2
- D: 3
Zset structure

User	MsgId 1	MsgId 2	MsgId 3
B	1	–	–
C	4	7	–
D	8	9	10

When a message goes upstream, the queue updates the unread message index: the corresponding field in the hash is incremented by one and the message ID is appended to the corresponding friend's zset.
When an ACK is received, the unread message index is maintained in the opposite way: the corresponding hash field is decremented by one and the message ID is removed from the corresponding friend's zset.
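The index maintenance above can be modeled in memory without a Redis server. In this sketch a dict of counts stands in for the hash and a sorted list per friend stands in for each zset; the class and method names are hypothetical.

```python
import bisect
from collections import defaultdict


class UnreadIndex:
    """In-memory stand-in for the Redis hash plus one zset per friend."""

    def __init__(self) -> None:
        self.counts = defaultdict(int)  # friend -> unread count (the hash)
        self.ids = defaultdict(list)    # friend -> sorted message IDs (the zset)

    def on_message(self, friend: str, msg_id: int) -> None:
        ids = self.ids[friend]
        position = bisect.bisect_left(ids, msg_id)
        if position < len(ids) and ids[position] == msg_id:
            return  # like a zset, ignore an ID that is already present
        ids.insert(position, msg_id)
        self.counts[friend] += 1

    def on_ack(self, friend: str, msg_id: int) -> None:
        ids = self.ids[friend]
        position = bisect.bisect_left(ids, msg_id)
        if position < len(ids) and ids[position] == msg_id:
            ids.pop(position)
            self.counts[friend] -= 1


if __name__ == "__main__":
    index = UnreadIndex()
    for friend, msg_id in [("B", 1), ("C", 4), ("C", 7), ("D", 8), ("D", 9), ("D", 10)]:
        index.on_message(friend, msg_id)
    print(dict(index.counts))
    index.on_ack("C", 4)
    print(dict(index.counts), index.ids["C"])
```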

Message downlink (fetching unread messages)

This flow lets a user who was offline fetch unread messages.

It is served mainly by the sessions/recent interface, as follows:

HGETALL reads the hash in the unread message index.
Iterate over the hash. If an unread count is not 0, read the corresponding friend's zset to get the list of unread message IDs.
Using the list of message IDs, read the message content from the cache (falling through to the database on a miss) and send it down to the client.

As in the online flow, after receiving the messages the client sends an ACK to the server to confirm that the unread messages were delivered, and the server then maintains the unread message index.

Unlike the online flow, this ACK is implemented by calling the messages/lastAccessedId interface. The client sends the server a hash whose keys are the friend IDs returned by the sessions/recent interface and whose values are the maximum message ID in each friend's unread list as returned by sessions/recent.

The server receives the hash and, for each entry:

Clears the corresponding cache
Uses ZREMRANGEBYSCORE to clear the corresponding friend's zset
Subtracts the return value of ZREMRANGEBYSCORE from the corresponding field of the hash in the unread message index

This completes the maintenance of the unread message index in the offline flow.
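The pull and the batched ACK can be sketched in memory as follows. Plain dicts and sorted lists stand in for the Redis hash and zsets, the function names are hypothetical, and the removal step imitates what ZREMRANGEBYSCORE does (remove every ID up to the maximum and report how many were removed).

```python
import bisect


def pull_unread(counts: dict, zsets: dict) -> dict:
    """Mimics reading the hash, then each friend's zset when the count is not 0."""
    return {
        friend: list(zsets.get(friend, []))
        for friend, unread in counts.items()
        if unread != 0
    }


def ack_last_accessed(counts: dict, zsets: dict, last_accessed: dict) -> None:
    """Applies the {friend: max_msg_id} hash sent by the client."""
    for friend, max_id in last_accessed.items():
        ids = zsets.get(friend, [])
        # Number of IDs <= max_id, i.e. what a range removal would report.
        removed = bisect.bisect_right(ids, max_id)
        del ids[:removed]
        counts[friend] = counts.get(friend, 0) - removed


if __name__ == "__main__":
    counts = {"B": 1, "C": 2, "D": 3}
    zsets = {"B": [1], "C": [4, 7], "D": [8, 9, 10]}
    pulled = pull_unread(counts, zsets)
    # A new message from D arrives after the pull but before the ACK.
    zsets["D"].append(11)
    counts["D"] += 1
    ack = {friend: ids[-1] for friend, ids in pulled.items()}
    ack_last_accessed(counts, zsets, ack)
    print(counts, zsets["D"])
```

Acknowledging by maximum ID rather than clearing the whole zset is what keeps a message that arrives between the pull and the ACK from being lost.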

Queue processing flow

If the message is marked offline, it is stored, written to the cache (only offline messages are written to the cache), the unread message index is updated, and APNs is called to push.
If the message is marked online, it is simply stored, because B has already received it.
If the message is marked REDELIVER, it is written to the cache and APNs is then called to push.
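The three branches can be sketched as a small dispatcher. The status strings, the message fields, and the list and dict stand-ins for the database, cache, unread index, and APNs push are all assumptions made for illustration.

```python
def process(message: dict, db: list, cache: dict,
            unread: dict, pushed: list) -> None:
    """Routes one queued message according to its status flag."""
    status = message["status"]
    msg_id = message["msg_id"]
    if status == "offline":
        db.append(message)                       # store
        cache[msg_id] = message                  # only offline messages are cached
        unread.setdefault(message["from"], []).append(msg_id)
        pushed.append(msg_id)                    # stands in for the APNs call
    elif status == "online":
        db.append(message)                       # receiver already has it
    elif status == "redeliver":
        cache[msg_id] = message
        pushed.append(msg_id)
    else:
        raise ValueError(f"unknown status: {status!r}")


if __name__ == "__main__":
    db, cache, unread, pushed = [], {}, {}, []
    process({"msg_id": 1, "from": "B", "status": "offline"}, db, cache, unread, pushed)
    process({"msg_id": 2, "from": "B", "status": "online"}, db, cache, unread, pushed)
    process({"msg_id": 3, "from": "B", "status": "redeliver"}, db, cache, unread, pushed)
    print(len(db), sorted(cache), unread, pushed)
```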

Questions after the discussion

Is it necessary to separate Connd Server from Access?

Purpose of the split:
- A more stable connection layer
- Fewer restarts, making Access upgrades easier
Can it really achieve that?
- A more stable connection layer: hard metrics are needed to judge whether it is more stable, since Access is not heavy and is not currently a bottleneck.
  - Access is not heavy at present, so is the split really necessary?
  - If anything is to be split, it should be Oracle rather than Access, along the lines of microservices
  - The stability gain is not so evident. In the original design connd is thinner and carries no business logic, whereas Access still has some business logic today and is therefore relatively likely to need upgrades.
  - The point of splitting Access is to make the layer that holds connections thin enough that business changes never require updating its code (so TCP connections are not broken).
- Fewer restarts and easier Access upgrades: adding a layer of services cannot by itself make restarts and upgrades safe; other mechanisms are needed so that the server can be upgraded without affecting users on TCP long connections
  - After the split, connd Server may still need to be restarted, so the key issue remains unresolved
  - The idea of adding this layer is that connd only manages connections, through shared memory, so that users are not disconnected when Access is updated.
- Adding a service adds one more hop, which can make the call chain long. The more services a request passes through, the lower the overall availability: because it is hard to guarantee 99.999% availability (five nines) for every service, each added service reduces the availability of the whole.
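The availability argument can be checked with a little arithmetic. The sketch assumes the services sit in series on the request path and fail independently, so their availabilities multiply; real systems rarely satisfy this exactly, so the numbers are only indicative.

```python
MINUTES_PER_YEAR = 365 * 24 * 60


def chain_availability(per_service: float, hops: int) -> float:
    """Availability of a request path that must pass through every hop."""
    return per_service ** hops


def downtime_minutes_per_year(availability: float) -> float:
    return (1 - availability) * MINUTES_PER_YEAR


if __name__ == "__main__":
    for hops in (1, 2, 3, 5):
        availability = chain_availability(0.99999, hops)
        print(hops, f"{availability:.6f}",
              f"{downtime_minutes_per_year(availability):.1f} min/year")
```

At five nines a single service allows roughly 5.3 minutes of downtime per year, and under these assumptions each additional hop adds about the same amount again.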
Architecture improvements must be backed by data; only measured results can prove that an improvement works. Otherwise two months may be spent on an improvement that turns out to be useless, wasting manpower and time and reducing development efficiency
- The architecture may differ in each phase, depending on the user base and traffic at that phase

How can access-layer services be restarted and upgraded? How are services scaled out or in?

Solution: add a signaling exchange. If the server is about to restart or scale in, it notifies all clients connected to this Access node that the server needs to upgrade and that they must reconnect to other nodes
- This is in effect an active migration strategy: clients still reconnect, but they do so proactively rather than being cut off.
Once all clients on the current Access node have connected to other nodes, the node can be restarted, taken offline, or scaled in.
How is capacity expanded? Add new nodes and use etcd for service registration and discovery. Clients ask the Router Server and obtain the node to connect to.
If the current 3 nodes cannot carry the load and 2 nodes are added, what should be done to relieve the pressure on the 3 existing nodes immediately?
- With the previous approach, a client reaches a new node only when it logs in again, asks the Router Server, and then connects. This cannot relieve the pressure immediately, because after the new nodes are added the load still sits on the existing nodes
- Therefore, a better mechanism is needed that lets the server take control
  - The server sends commands to clients on the current node telling them to connect to a new node.
  - The server also needs to determine whether that portion of clients has connected to other nodes, and apply policies accordingly.

How to prevent attacks

All production machines have firewall policies (hardware and software firewalls)
- Hardware firewall: dedicated firewall appliances. They are very expensive; some have been purchased, but they are used sparingly
- Software firewall: at the software layer, for example iptables, with firewall policies configured in iptables
At the TCP channel level
- Rate-limit socket connection attempts so that nobody can keep opening socket connections; otherwise sockets are easily exhausted and the service cannot cope
  - If a single IP address opens connections at more than 100/s, it is treated as an attacker and blocked
- Rate-limit message sending and receiving so that nobody can send messages continuously; otherwise the whole service goes down
  - To send messages, a client must be logged in
  - To log in, it must have a token and a secret key
  - Rate limits can also be set on sending and receiving messages
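A minimal sketch of the per-IP connection rate limit, using a sliding one-second window. The class name is hypothetical, timestamps are passed in explicitly to keep the example deterministic, and blocking an address permanently after one violation is an assumption; a real system would more likely expire the block or hand the address to the firewall.

```python
from collections import defaultdict, deque


class ConnectionRateLimiter:
    """Blocks an IP that opens more than max_per_window connections per window."""

    def __init__(self, max_per_window: int = 100, window_s: float = 1.0) -> None:
        self.max_per_window = max_per_window
        self.window_s = window_s
        self._events = defaultdict(deque)  # ip -> recent connection timestamps
        self.blocked = set()

    def allow(self, ip: str, now: float) -> bool:
        if ip in self.blocked:
            return False
        events = self._events[ip]
        # Drop timestamps that have fallen out of the window.
        while events and now - events[0] >= self.window_s:
            events.popleft()
        events.append(now)
        if len(events) > self.max_per_window:
            self.blocked.add(ip)
            return False
        return True


if __name__ == "__main__":
    limiter = ConnectionRateLimiter()
    results = [limiter.allow("10.0.0.1", now=i * 0.005) for i in range(101)]
    print(results.count(True), results[-1], "10.0.0.1" in limiter.blocked)
```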

Comparing and selecting among the open-source and general-purpose protocols currently available

Why is XMPP not a good fit? Is it only because XML data is large?
- There are optimized XMPP solutions, so heavy traffic is not the main drawback
- Another issue is unreliable messaging: its request and reply mechanism is designed mainly for stable long-connection networks and is not specifically optimized for mobile networks with narrow bandwidth and unstable long connections
- Therefore XMPP, which was designed to support multi-terminal state, is not at its best in the mobile world
Why is MQTT not suitable? Why does the XXX project not use MQTT?
- MQTT is suited to push rather than IM and requires extra handling at the business level. It is nevertheless in use at present
- That the XXX project does not use MQTT is a legacy issue. At the very beginning the project had to be delivered and the architecture built quickly, so Mogujie's TeamTalk was used.
- If there were no legacy issues, MQTT would be used
Besides data volume, should the complexity of the protocol and the complexity of handling it on the client and the server also be considered?
- The protocol should be easy to extend, make it easy to add fields later, and support multiple platforms
- Consider whether the client and server implementations are simple
- Encoding and decoding efficiency

Cross-data-center deployment and multi-data-center disaster recovery (DR)

Services must be able to span data centers, especially stateful nodes.
Multiple data centers are needed for disaster recovery (DR), to guard against the failure of an entire data center.

We just discussed what the access layer does:

Maintains TCP long connections, including heartbeat and timeout detection
Receives and unpacks packets
Provides anti-attack mechanisms
Waits for the response to a delivered message

Thinking points (key points for assessment)

Why might messages arrive out of order? How can messages be kept in order?
- Consider the offline case
- Consider network exceptions
How should the storage mode and storage structure for offline messages be designed?
- Consider how many people will send messages
- Consider cache + DB
How can messages be guaranteed to be neither lost nor duplicated? How should the loss-prevention mechanism be designed?
- Consider the possibility of multiple terminals logging in with the same account
- Consider that the ACK may also be lost on a weak network
How are long connections managed?
- Consider fast lookup
  - How can the connection for a given request be found quickly?
There are multiple access-layer nodes and they are stateful. What mechanism ensures that the response to a request sent from node 1 returns to node 1?
- Or, what goes wrong if the response goes back to node 2 instead of node 1?

