Author: Liu Fan

He joined the Qunar IM team in May 2015 and is mainly responsible for the system development and architecture design of the office IM and customer service IM. He specializes in building highly concurrent, highly available application services in Erlang, and is dedicated to an extensible IM system with modular components, in which a single set of services supports office chat, customer service, message push, and other functions.

Driven by its business needs for IM, and by the functions and extensions those needs require, Qunar drew on existing IM implementations on the market to build its own complete office IM and customer service IM system. The system has the following key properties: real-time delivery, reliability, consistency, security, scalability, and high concurrency.

1. What is IM

Instant Messaging (IM) is an online communication technology that provides real-time message transmission through the network. IM systems generally include IM clients, IM servers, networks, and messages transmitted between them.

The whole process resembles sending a parcel: user A (a client) writes the sender and recipient on the parcel (the message) and hands it to the post office (the server). The post office reads the recipient information on the parcel and delivers it to user B (another client), which completes the transmission of the message.

2. Common IM implementations

The XMPP protocol

XMPP (Extensible Messaging and Presence Protocol) is an open, XML-based protocol designed for near-real-time messaging, presence information, and request-response services.

XMPP stanzas fall into three broad categories:

  • Presence: reports the status of an XMPP entity, telling the server whether the entity is online, offline, or busy.

  • Message: messages sent and received between users.

  • IQ (Info/Query): request-response stanzas.
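To make the three categories concrete, the sketch below classifies a stanza by its root element using only the Python standard library. The addresses and the `stanza_kind` helper are illustrative assumptions, not part of Qunar's code.

```python
import xml.etree.ElementTree as ET

CORE_STANZAS = ("presence", "message", "iq")


def stanza_kind(xml_text: str) -> str:
    """Return 'presence', 'message', or 'iq' for a single XMPP stanza."""
    root = ET.fromstring(xml_text)
    # Drop a namespace prefix such as '{jabber:client}message' if present.
    tag = root.tag.rsplit("}", 1)[-1]
    if tag not in CORE_STANZAS:
        raise ValueError(f"not a core XMPP stanza: {tag}")
    return tag


# Hypothetical sample stanzas, one per category.
presence = "<presence><show>away</show></presence>"
message = (
    "<message from='a@example.com' to='b@example.com' type='chat'>"
    "<body>hi</body></message>"
)
iq = "<iq type='get' id='roster1'><query xmlns='jabber:iq:roster'/></iq>"

for sample in (presence, message, iq):
    print(stanza_kind(sample))
```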

Advantages: XMPP has many mature implementations and an active community, with rich functionality and extension capabilities.

Disadvantages: its XML packets are large, which consumes network traffic and battery power.

The MQTT protocol

MQTT (Message Queuing Telemetry Transport) is a lightweight communication protocol based on the publish/subscribe model. It runs over TCP/IP and was released by IBM in 1999. Its biggest advantage is that it provides real-time, reliable messaging for remote devices with very little code and limited bandwidth. As a messaging protocol with low overhead and low bandwidth consumption, it is widely used in the Internet of Things, small devices, and mobile applications.

MQTT is a client-server publish/subscribe message transport protocol. It is lightweight, simple, open, and easy to implement, which makes it applicable to a wide range of scenarios. That simplicity is also its drawback: IM logic such as chat and friend lists has to be implemented on top of it.

3. Qunar implementation

Protocol selection

After investigating the common IM protocols above, we chose XMPP as our initial protocol:

  • Basic chat functions can be implemented with minimal development effort;

  • Existing extensions let us add more functions quickly;

  • To address XMPP's shortcomings, we changed the network transmission stage to use Protocol Buffers, which offsets the cost in network traffic and battery power.

Open source project selection

Our choice was based on the following:

  • ejabberd is an instant messaging server based on the Jabber/XMPP protocol;

  • It is licensed under GPLv2 (free and open source);

  • It is developed with Erlang/OTP and is cross-platform, fault-tolerant, clustered, and modular;

  • ejabberd is one of the most scalable Jabber/XMPP servers. It supports distribution across multiple servers with fault tolerance, so the failure of a single server does not affect the operation of the whole cluster;

  • Erlang’s scheduling and GC strategies are well suited to IM’s real-time requirements (see the references at the end).

The team therefore chose the open source ejabberd implementation to build its IM capabilities quickly.

Architecture design

  1. The client communicates with the server through two connections
  • TCP long connection (WebSocket on the web): carries packets that are state-related or require multi-endpoint synchronization.

  • HTTP connection: carries requests that are state-independent or do not require multi-endpoint synchronization.

  2. Load balancing
  • TCP long connections are load balanced by LVS or HA.

  • HTTP is load balanced by Nginx.

  3. Data
  • Data is either written directly to the database or published to MQ, where subscribers consume it;

  • Frequently accessed data is kept in Redis to serve frequent queries and reduce pressure on the database.

  4. Management and maintenance
  • Provide intranet interfaces so that other systems can extend IM functions.

  • Provide monitoring and maintenance tools to ease system maintenance and troubleshooting.

Ejabberd architecture

The message flow process is as follows:

  1. The client sends the message to the server over the long connection, where it enters the ejabberd_c2s process responsible for that connection.

  2. After the ejabberd_c2s process finishes its handling, it calls ejabberd_router:route(From, To, Stanza) to route the message.

  3. ejabberd_router runs its own common logic.

  4. If the message is addressed to this IM system, it is sent to the ejabberd_local process.

  5. If ejabberd_local determines that To is a specific person, it sends the message to the ejabberd_sm process, which manages user sessions.

  6. The ejabberd_sm process looks up which devices the recipient has online and sends the message to each of the user’s ejabberd_c2s processes.

  7. Each ejabberd_c2s process sends the message to its client over its own TCP connection.

In this way, a message is transmitted end to end from user A to user B, which achieves instant communication.

  • Parts of the figure not covered above: communication with other IM servers (ejabberd_s2s) and local processing by the server ("Processed by server").
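The seven steps above can be approximated with a toy in-memory model. This is only a sketch of the routing idea in Python, not ejabberd code: `ToyRouter`, its methods, and the `example.com` domain are hypothetical, and each per-device inbox stands in for an ejabberd_c2s process.

```python
from collections import defaultdict


class ToyRouter:
    """Toy model of the route -> local -> session manager -> c2s chain."""

    def __init__(self, local_domain):
        self.local_domain = local_domain
        # user -> {device name -> list of delivered (sender, stanza) pairs}
        self.sessions = defaultdict(dict)

    def connect(self, user, device):
        """Register an online device (stands in for one c2s process)."""
        self.sessions[user][device] = []

    def route(self, sender, recipient, stanza):
        """Deliver to every online device; return how many were reached."""
        user, _, domain = recipient.partition("@")
        if domain != self.local_domain:
            return 0  # server-to-server delivery is not modelled here
        devices = self.sessions.get(user, {})
        for inbox in devices.values():
            inbox.append((sender, stanza))
        return len(devices)


router = ToyRouter("example.com")
router.connect("b", "phone")
router.connect("b", "desktop")
delivered = router.route("a@example.com", "b@example.com", "hello")
print(delivered)
```

A recipient with no online device gets a count of zero here; in the real system that case is covered by the offline history pull described later.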

Ejabberd function extension

At each step of the ejabberd architecture diagram in the previous section, we can register hook functions to add our own extensions. The extensions we added are:

Using Protocol Buffers

XMPP's drawback is that its XML packets are large, which costs traffic and power. We therefore use Protocol Buffers instead of XML for traffic between the client and the server, which reduces packet size and the power consumption of mobile phones. On the server, Protocol Buffers messages are converted into the XML format that ejabberd uses, which minimizes changes to the server's IM logic.

Message reliability

We use the following mechanisms to ensure the reliability of IM messages:

  • When the device is online, message acknowledgement receipts ensure that messages are correctly sent and received.

  • When a device that was offline logs in again, it pulls all historical messages from the period between its last logout and the current login through the HTTP interface.

  • Each message has a unique ID, which makes sending idempotent: the same message sent repeatedly is displayed and stored only once.
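A minimal sketch of the last two points, assuming an in-memory store: messages are keyed by their unique ID so a repeated send is ignored, and a history query returns what arrived between logout and login. The class and field names are hypothetical.

```python
class MessageStore:
    """Stores each message once, keyed by its unique ID."""

    def __init__(self):
        self._by_id = {}

    def save(self, msg_id, body, timestamp):
        """Return True if stored, False if this ID was already seen."""
        if msg_id in self._by_id:
            return False
        self._by_id[msg_id] = {"id": msg_id, "body": body, "timestamp": timestamp}
        return True

    def history_between(self, last_logout, now):
        """Messages with last_logout < timestamp <= now, oldest first."""
        hits = [
            m for m in self._by_id.values()
            if last_logout < m["timestamp"] <= now
        ]
        return sorted(hits, key=lambda m: m["timestamp"])


store = MessageStore()
store.save("m1", "hello", 100)
store.save("m1", "hello", 105)  # duplicate send of the same ID, ignored
store.save("m2", "are you there?", 200)
print([m["id"] for m in store.history_between(150, 300)])
```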

Message acknowledgement receipt

When the ejabberd_c2s process receives a message, a hook function we added sends the client an acknowledgement that the server has received the message. The acknowledgement also carries a timestamp, which the client uses as the timestamp of the message.

The acknowledgement receipt mainly ensures that a message is sent successfully at least once. The client considers a message delivered to the server only after it receives the server's acknowledgement; otherwise, it assumes the message did not reach the server and that sending failed.
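On the client side, this behaviour can be sketched as a table of pending messages that is cleared only by an acknowledgement. The sketch below is an assumption about how such a client might look, not Qunar's client code; `transport` is a hypothetical callable that writes to the long connection. Resending reuses the original message ID, so the server can deduplicate.

```python
import uuid


class AckTrackingClient:
    """Keeps a message pending until the server acknowledges it."""

    def __init__(self, transport):
        self._transport = transport  # callable(msg_id, body)
        self.pending = {}    # msg_id -> body, not yet acknowledged
        self.confirmed = {}  # msg_id -> timestamp assigned by the server

    def send(self, body):
        msg_id = uuid.uuid4().hex
        self.pending[msg_id] = body
        self._transport(msg_id, body)
        return msg_id

    def on_ack(self, msg_id, server_timestamp):
        """Called when the server's receipt arrives."""
        if self.pending.pop(msg_id, None) is not None:
            self.confirmed[msg_id] = server_timestamp

    def resend_unacked(self):
        """Retry everything still pending, keeping the original IDs."""
        for msg_id, body in list(self.pending.items()):
            self._transport(msg_id, body)


wire = []
client = AckTrackingClient(lambda msg_id, body: wire.append((msg_id, body)))
first = client.send("hello")
client.resend_unacked()                       # no ack yet, so it is sent again
client.on_ack(first, server_timestamp=1700000000)
print(len(wire), client.pending, client.confirmed[first])
```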

Message synchronization

When the same account is logged in on multiple devices at the same time, each device needs to learn about the messages sent and received on the other devices and show them to the user. Therefore, sending and receiving messages both require extra processing to support multi-device synchronization.

Synchronizing sent messages to other devices:

Publishing messages to message queues

To extend IM functions, we publish all IM messages and events to a message queue for other systems to subscribe to and consume, which enables statistics, analysis, and storage of messages. We classify the events and messages and put them into Kafka for asynchronous processing.

Currently, the following events are sent: messages, online and offline events, trap events, and @ events.

HTTP interface for sending messages

So that other systems can send messages, we provide HTTP interfaces that simulate messages sent by users.

Delivering IM authentication credentials

We can embed other systems in IM to extend its capabilities, including but not limited to mobile OA and operations alerting systems. When the IM client switches to another system, it carries the IM authentication credential, and the other system calls an IM interface to verify the user's identity. If verification succeeds, the user accesses the system through the IM client; otherwise, access is denied. This avoids repeated identity authentication and improves office efficiency and user experience.

The IM authentication credential works as follows:

  • After the IM long connection is established and authentication succeeds, the server sends a token to the client over that connection.

  • The client must carry the token when calling IM HTTP interfaces, for identity authentication.

  • When the client opens another trusted system, it also passes the token, which that system uses for identity authentication.

  • When the long connection between the client and the server is closed, the server destroys the token, which invalidates it.
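The token lifecycle above can be sketched as a small registry whose entries live exactly as long as the long connection. This is an illustrative assumption about the bookkeeping, with hypothetical names; a real deployment would keep the tokens in shared storage so that HTTP nodes can check them.

```python
import secrets


class TokenRegistry:
    """Tokens are valid only while the owning long connection is open."""

    def __init__(self):
        self._user_by_token = {}

    def issue(self, user):
        """Called after the long connection authenticates successfully."""
        token = secrets.token_hex(16)
        self._user_by_token[token] = user
        return token

    def authenticate(self, token):
        """Return the user for a live token, or None if it is invalid."""
        return self._user_by_token.get(token)

    def revoke(self, token):
        """Called when the long connection is closed."""
        self._user_by_token.pop(token, None)


registry = TokenRegistry()
token = registry.issue("alice")
print(registry.authenticate(token))   # alice, while connected
registry.revoke(token)
print(registry.authenticate(token))   # None, after disconnect
```

Keying the registry by token rather than by user lets each device of the same account hold its own credential.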

Incremental pull

For data that must be synchronized between the client and the server, we use incremental updates to shrink the data set in each server response and speed up client login and synchronization.

In the implementation, IM uses updatetime as the query key, and the server updates the updatetime field whenever a record changes. When the client logs in again, it pulls data using the latest update time it holds. If the server has data newer than the client's, it returns only that incremental data.

The application scenarios include:

  • Organizational structure updates

  • Message history updates

  • Group list updates

  • Friend list updates

  • Profile updates
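The incremental pull can be sketched as a filter on updatetime. The record layout and the `pull_increment` name are assumptions for illustration; a real server would run the equivalent query against its database.

```python
def pull_increment(records, client_updatetime):
    """Return (changed_records, new_updatetime) for one client request."""
    changed = [r for r in records if r["updatetime"] > client_updatetime]
    changed.sort(key=lambda r: r["updatetime"])
    # If nothing changed, the client keeps the update time it already has.
    newest = max((r["updatetime"] for r in changed), default=client_updatetime)
    return changed, newest


# Hypothetical group list held by the server.
groups = [
    {"id": "g1", "updatetime": 100},
    {"id": "g2", "updatetime": 250},
    {"id": "g3", "updatetime": 300},
]
changed, newest = pull_increment(groups, client_updatetime=200)
print([g["id"] for g in changed], newest)
```

The client stores the returned update time and sends it with the next pull, so an unchanged data set produces an empty response.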

IM Function Extension

Robot implementation

We need to provide self-service and intelligent replies through IM, so we implement this as an IM extension.

First, all messages are already available through the message queue service. We subscribe to all messages from the queue and, according to the system configuration, forward the messages addressed to a specific robot to the corresponding robot service. When a robot service receives a message addressed to it, it acts on it using its own configuration or a self-learned question bank, and then calls the IM message-sending interface to return a reply to the user who asked.

In this way we can provide various self-service and intelligent services, which saves labor costs and improves office efficiency.
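The dispatch step can be sketched as a loop over subscribed messages. Everything here is a stand-in: `messages` represents what a queue consumer would yield, `robots` represents the system configuration, and `send_message` represents the IM message-sending interface.

```python
def dispatch_to_robots(messages, robots, send_message):
    """Forward messages addressed to a robot and send its reply back."""
    handled = 0
    for msg in messages:
        robot = robots.get(msg["to"])
        if robot is None:
            continue  # ordinary user-to-user message, not for a robot
        reply = robot(msg["body"])
        send_message(sender=msg["to"], recipient=msg["from"], body=reply)
        handled += 1
    return handled


# Hypothetical FAQ robot backed by a small question bank.
faq = {"wifi": "Connect to the office network with your staff account."}
robots = {"helpdesk-bot": lambda body: faq.get(body, "Sorry, I do not know that yet.")}

outbox = []


def send_message(sender, recipient, body):
    outbox.append((sender, recipient, body))


messages = [
    {"from": "alice", "to": "bob", "body": "lunch?"},
    {"from": "alice", "to": "helpdesk-bot", "body": "wifi"},
]
print(dispatch_to_robots(messages, robots, send_message), outbox)
```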

Customer service system implementation

The customer service system differs from ordinary IM. From the user’s point of view, they are chatting with a shop or an official customer service account. In reality, there may be several customer service agents behind that account, along with logic such as queuing and session timeouts, so the common IM functions need to be extended.

The customer service system subscribes to all IM messages. When a user sends a message to customer service, the system queues the consultation, assigns an agent, and establishes a session. It then converts the message the user sent to the shop into a message addressed to the assigned agent and delivers it.

user -------------> shop    is converted into    shop -------------> customer service
customer service -------------> shop    is converted into    shop -------------> user
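The queuing, assignment, and message conversion can be sketched as follows. This is a simplified model under stated assumptions: each agent serves one user at a time, session timeouts are left out, and the `ShopDesk` class and its method names are hypothetical.

```python
from collections import deque


class ShopDesk:
    """Maps user <-> shop conversations onto one assigned agent."""

    def __init__(self, shop, agents):
        self.shop = shop
        self._idle = deque(agents)  # agents without an open session
        self._backlog = {}          # waiting user -> queued bodies (FIFO by insertion)
        self._agent_of = {}         # user -> agent
        self._user_of = {}          # agent -> user

    def _assign(self, user):
        agent = self._idle.popleft()
        self._agent_of[user] = agent
        self._user_of[agent] = user
        return agent

    def from_user(self, user, body):
        """Convert user -> shop into shop -> agent, or queue the user."""
        agent = self._agent_of.get(user)
        if agent is None and self._idle:
            agent = self._assign(user)
        if agent is None:
            self._backlog.setdefault(user, []).append(body)
            return []
        return [{"from": self.shop, "to": agent, "body": body}]

    def from_agent(self, agent, body):
        """Convert agent -> shop into shop -> user."""
        return {"from": self.shop, "to": self._user_of[agent], "body": body}

    def close(self, agent):
        """End the session; the next waiting user, if any, gets an agent."""
        user = self._user_of.pop(agent)
        del self._agent_of[user]
        self._idle.append(agent)
        if not self._backlog:
            return []
        next_user = next(iter(self._backlog))
        bodies = self._backlog.pop(next_user)
        new_agent = self._assign(next_user)
        return [{"from": self.shop, "to": new_agent, "body": b} for b in bodies]


desk = ShopDesk("shop", ["agent1"])
print(desk.from_user("alice", "hi"))      # delivered to agent1
print(desk.from_user("bob", "hello"))     # queued, no agent is free
print(desk.from_agent("agent1", "welcome"))
print(desk.close("agent1"))               # bob's queued message reaches agent1
```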

User side:

Customer service side:

4. Data indicators


For IM, we mainly track the following indicators:

  • Number of concurrent online users

  • TCP connections established

  • Message volume

The actual values of these indicators are as follows:

Concurrent online users: about 200,000

TCP connections established (QPS): about 30,000

Messages received (QPS): about 30,000

Messages sent (QPS): about 30,000

5. System parameter tuning

A long-connection service must support a large number of simultaneous TCP connections, and the default system configuration cannot meet that requirement. We therefore adjust the system configuration to match our needs and achieve the best performance.

Linux operating system parameters

Maximum number of file handles that can be allocated system-wide:

    # 2 million system-wide
    sysctl -w fs.file-max=2097152
    sysctl -w fs.nr_open=2097152
    echo 2097152 > /proc/sys/fs/nr_open

Number of file handles allowed for the current session/process:

    ulimit -n 1048576

/etc/sysctl.conf

Persist 'fs.file-max' in /etc/sysctl.conf:

    fs.file-max = 1048576

/etc/systemd/system.conf sets the maximum number of file handles for services:

    DefaultLimitNOFILE=1048576

/etc/security/limits.conf

/etc/security/limits.conf persistently sets the number of file handles that each user/process may open:

    *      soft   nofile      1048576
    *      hard   nofile      1048576

TCP stack network parameters

Backlog settings for concurrent connections:

    sysctl -w net.core.somaxconn=32768
    sysctl -w net.ipv4.tcp_max_syn_backlog=16384
    sysctl -w net.core.netdev_max_backlog=16384

Available local port range:

    sysctl -w net.ipv4.ip_local_port_range='1000 65535'

TCP socket buffer settings:

    sysctl -w net.core.rmem_default=262144
    sysctl -w net.core.wmem_default=262144
    sysctl -w net.core.rmem_max=16777216
    sysctl -w net.core.wmem_max=16777216
    sysctl -w net.core.optmem_max=16777216

    #sysctl -w net.ipv4.tcp_mem='16777216 16777216 16777216'
    sysctl -w net.ipv4.tcp_rmem='1024 4096 16777216'
    sysctl -w net.ipv4.tcp_wmem='1024 4096 16777216'

TCP connection tracking settings:

    sysctl -w net.nf_conntrack_max=1000000
    sysctl -w net.netfilter.nf_conntrack_max=1000000
    sysctl -w net.netfilter.nf_conntrack_tcp_timeout_time_wait=30

Maximum number, recycling, and reuse of TIME-WAIT sockets:

    net.ipv4.tcp_max_tw_buckets=1048576
    # Enabling the following settings is not recommended
    # net.ipv4.tcp_tw_recycle = 1
    # net.ipv4.tcp_tw_reuse = 1

FIN-WAIT-2 socket timeout setting:

    net.ipv4.tcp_fin_timeout = 15
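Values set with `sysctl -w` last only until the next reboot, so they are usually copied into /etc/sysctl.conf as well. As a convenience sketch, the hypothetical helper below converts `sysctl -w key=value` command lines into `key = value` lines for that file; it only produces text and does not touch the system.

```python
import shlex


def to_sysctl_conf(commands: str) -> str:
    """Convert 'sysctl -w key=value' lines into sysctl.conf lines."""
    lines = []
    for command in commands.splitlines():
        # comments=True drops commented-out commands such as '#sysctl ...'.
        parts = shlex.split(command, comments=True)
        if len(parts) != 3 or parts[:2] != ["sysctl", "-w"]:
            continue  # skip blank lines and anything that is not 'sysctl -w'
        key, _, value = parts[2].partition("=")
        lines.append(f"{key} = {value}")
    return "\n".join(lines)


commands = """\
sysctl -w net.core.somaxconn=32768
#sysctl -w net.ipv4.tcp_mem='16777216 16777216 16777216'
sysctl -w net.ipv4.tcp_rmem='1024 4096 16777216'
echo 2097152 > /proc/sys/fs/nr_open
"""
print(to_sysctl_conf(commands))
```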

6. Summary

Having implemented the basic IM features and a range of extensions, we distilled the following core IM capabilities:

  • Provide a stable TCP long-connection service

  • Provide a unified authentication service

  • Provide other services with high-performance message subscription and message sending

References:

  • [1] Erlang GC document

  • [2] Erlang Garbage Collection Details and Why It Matters

  • [3] Erlang Scheduler Details and Why It Matters

  • [4] Ejabberd massive scalability: Single node with 2+ million concurrent users

  • [5] The evolution of the GC Heap model
