0. Foreword

As outlined in the opening article of this series (an overview of the mPaaS server-side core component system), the mPaaS Message Push Service (MPS) provides a professional mobile message push solution. It offers several push types for different scenarios to meet personalized push requirements, and it integrates the vendor push channels of Apple, Huawei, Xiaomi, OPPO, vivo, FCM, and others. In addition to quick pushes from the console, MPS provides a server-side integration option so that businesses can quickly add push to their mobile apps and stay in touch with users, which helps improve user retention and user experience.

This article further introduces the architecture design and core business processes of the MPS server. First, let us look at where MPS sits within mPaaS:

As the figure above shows, MPS is one of the core basic components in the mPaaS system that communicate directly with clients. Its basic principle is to transmit “message notification” service data over a long-lived TCP connection.

At present, all apps in the Ant ecosystem use the message push service. After several generations of architecture evolution, integration with the unified access gateway, and adoption of Ant's self-developed MMTP transport protocol, MPS has become a basic service that supports multiple apps, serves more than one billion users in total, and pushes billions of message notifications every day.

MPS also supports the standard HTTP/2 protocol to meet the needs of heterogeneous clients and the standardized app management requirements of other industries. To support lightweight deployment, MPS still retains an independent access gateway (McOmetgw). You can choose the access gateway according to the actual deployment scenario and the network SDK integrated by the client.

  • Note 1: Spanner is a system that Ant Group developed on top of Nginx. It mainly handles SSL offloading and load balancing at the HTTP and TCP access layer, and is one of Ant Group's traffic entry systems. Using standard configuration files and self-developed Nginx modules, Spanner supports cross-zone traffic forwarding, blue-green releases, and disaster recovery (DR) switchover in the LDC architecture.

  • Note 2: McOmetgw is the access gateway that the MPS team used before integrating with Spanner. It is also built on top of Nginx. Its protocol handling works only with the MPS service and not with MGS, MSS, or other mPaaS components.

At the access layer, besides the access gateway needed for message notifications themselves, MPS also relies on the mPaaS mobile gateway (MGS). Through RPC calls over MGS, the client performs device registration, user binding, and binding of third-party channel relationships. The log gateway MDAP is also still used independently. It collects and uploads client behavior logs (tracking points) according to established specifications so that the monitoring and analysis systems can analyze the data and produce reports. Given the priority and frequency of log usage, and the goal of isolating log traffic from the main service link, logs are still uploaded over HTTP/HTTPS. After client logs are reported, log collection tools and message queues deployed with the application services deliver them to data warehouses or other data analysis systems for analysis, reporting, monitoring, and alerting (the related processes and frameworks are not the focus of this article; see designs such as ELK for reference).

The channel described above is Ant's self-built channel. To meet the control requirements of the major Chinese phone manufacturers and to support standard access for Ant's international business, MPS also supports pushing through Huawei, Xiaomi, Meizu, OPPO, vivo, FCM, Apple's APNs, and other channels, while keeping these channels transparent to back-end business systems. Business systems can therefore focus on their own functionality without worrying about device models.

Next, we will look at several core business processes on the MPS server side.

1. Process for device connection, registration, and binding

MPS fundamentally relies on a stable physical TCP connection between the client and the server. (Note: TCP connections are kept alive mainly by a heartbeat mechanism. How to establish connections quickly and keep them stable will be covered in later articles dedicated to network optimization.)

So every story starts with a TCP connection. Once the connection is established, the client network SDK reports some basic information, such as the client product ID, version number, operating system, operating system version, and device model. This lets the back end make more logical decisions and supports multi-dimensional push. MPS saves a copy of the connection information to a cache (note: depending on the deployment environment, this can be Redis, memcache, or OCS, Tair, and Tbase in the Alibaba ecosystem). The access gateway also keeps a copy in memory in preparation for later network-wide pushes.
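The bookkeeping described above can be pictured with a minimal sketch. The class names, field names, and the use of plain dictionaries in place of a real cache such as Redis are all assumptions made for illustration; they are not the MPS implementation.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ConnectionInfo:
    """Basic information a client reports once its TCP connection is up."""
    device_id: str      # hypothetical key used to look up the connection
    product_id: str
    app_version: str
    os_type: str
    os_version: str
    model: str
    gateway_host: str   # access gateway that holds the physical connection


class ConnectionRegistry:
    """Hypothetical registry: a shared cache plus per-gateway in-memory lists."""

    def __init__(self):
        self.cache = {}           # stand-in for Redis/memcache: device_id -> dict
        self.gateway_memory = {}  # gateway_host -> {device_id: ConnectionInfo}

    def on_connect(self, info: ConnectionInfo) -> None:
        # Save one copy to the cache and one copy in the gateway's memory.
        self.cache[info.device_id] = asdict(info)
        self.gateway_memory.setdefault(info.gateway_host, {})[info.device_id] = info

    def on_disconnect(self, device_id: str) -> None:
        record = self.cache.pop(device_id, None)
        if record is not None:
            self.gateway_memory.get(record["gateway_host"], {}).pop(device_id, None)

    def is_online(self, device_id: str) -> bool:
        return device_id in self.cache
```

Keeping the second copy inside the gateway means a later broadcast can walk local memory instead of querying the shared cache for every connection.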

For terminal devices that must be reached through a third-party channel, the MPS client SDK registers with that channel according to the terminal type to obtain the channel's unique ID, and then reports it to the MPS server through RPC. MPS generates a unique device ID from the client's basic information and persists the relationship between the third-party channel ID and the MPS device to the DB layer (note: MySQL and OceanBase are currently the main supported databases; to support hundreds of millions of users, the persistence layer must be sharded across databases and tables). At this point, MPS can in principle push messages by device and through third-party channels.
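As a rough illustration of device ID generation and of sharding across databases and tables, the sketch below hashes the device ID to pick a shard. The input fields, the hash functions, and the shard counts are assumptions chosen for the example; the article does not specify how MPS actually does this.

```python
import hashlib
from typing import Tuple

DB_COUNT = 4        # hypothetical number of databases
TABLES_PER_DB = 8   # hypothetical number of tables per database


def make_device_id(product_id: str, os_type: str, model: str, client_token: str) -> str:
    """Derive a stable device ID from basic client information (assumed inputs)."""
    raw = "|".join([product_id, os_type, model, client_token])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:32]


def shard_for(device_id: str) -> Tuple[int, int]:
    """Map a device ID to (database index, table index) with a stable hash."""
    digest = hashlib.md5(device_id.encode("utf-8")).hexdigest()
    slot = int(digest, 16) % (DB_COUNT * TABLES_PER_DB)
    return slot // TABLES_PER_DB, slot % TABLES_PER_DB
```

Because the hash is deterministic, the same device always lands in the same table, so a binding lookup touches a single shard.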

Of course, businesses generally push messages in response to business outcomes and need to notify specific users. MPS must therefore be able to route pushes by user ID. On the client side, this requires a binding step between users and devices, which is the user binding step in the figure above. Once it completes, MPS has what it needs to push along every dimension and waits for service invocations.
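Routing by user ID can be sketched as two lookups: user to devices, then device to channel. The store below is a hypothetical in-memory stand-in for the persisted relationships, and the `"self_built"` fallback for devices without a third-party binding is an assumption of this example.

```python
from collections import defaultdict


class BindingStore:
    """Hypothetical in-memory stand-in for the persisted binding relationships."""

    def __init__(self):
        self.user_devices = defaultdict(set)  # user_id -> set of device IDs
        self.device_channel = {}              # device_id -> (channel, channel_token)

    def bind_channel(self, device_id, channel, channel_token):
        self.device_channel[device_id] = (channel, channel_token)

    def bind_user(self, user_id, device_id):
        self.user_devices[user_id].add(device_id)

    def unbind_user(self, user_id, device_id):
        self.user_devices[user_id].discard(device_id)

    def route(self, user_id):
        """Return (device_id, channel, token) targets for a user, sorted by device ID."""
        targets = []
        for device_id in sorted(self.user_devices.get(user_id, ())):
            # Devices without a third-party binding fall back to the self-built channel.
            channel, token = self.device_channel.get(device_id, ("self_built", None))
            targets.append((device_id, channel, token))
        return targets
```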

2. Process of service invocation and message push

Currently, MPS supports two invocation modes:

  1. Business system interface call;
  2. mPaaS console page call.

Normal business processes are mainly triggered through interface calls, while routine verification, group sending, broadcasting, and similar operations can be performed directly in the console. Customers usually need to control the content of client message notifications strictly, so MPS provides push template management in the mPaaS console. For example, a template such as “Alipay payment received: #amount# yuan” can be configured. Only the amount variable in the template is replaceable and everything else is fixed, so only the amount attribute is passed when the interface is called.

In this way, administrators can effectively control and standardize the content of push messages. In addition, when the push content needs to change, only the template has to be updated, and business systems push notifications with the new content without any changes on their side. The MPS server flow therefore includes a template rendering step (voice broadcast templates can also be synchronized, and the client can play a voice notification after performing the replacement according to the voice-type message).
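A template rendering step of this kind can be sketched in a few lines. The `#name#` placeholder syntax follows the example given earlier; the function name and the choice to raise an error on a missing variable are assumptions of this sketch.

```python
import re

PLACEHOLDER = re.compile(r"#(\w+)#")


def render_template(template: str, params: dict) -> str:
    """Replace every #name# placeholder with the matching value from params."""

    def substitute(match):
        name = match.group(1)
        if name not in params:
            raise KeyError(f"missing template variable: {name}")
        return str(params[name])

    return PLACEHOLDER.sub(substitute, template)
```

For instance, `render_template("Payment received: #amount# yuan", {"amount": "12.50"})` produces `Payment received: 12.50 yuan`, and the caller never supplies the fixed text.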

In many business scenarios, the user may be offline or have no network when the business event occurs, so MPS persists every message. When an event occurs, MPS attempts to push the message through a third-party channel or the self-built TCP connection. For the self-built channel, MPS queries the cache to check whether the user's terminal has a TCP connection and, if so, pushes the message. After receiving the push, the client sends an acknowledgement to the server, and the server updates the message status. If the push fails or the acknowledgement is lost, the message is delivered again the next time the client establishes a connection, and the client deduplicates it.
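The persist, acknowledge, and redeliver cycle amounts to at-least-once delivery with client-side deduplication. The sketch below illustrates that idea with hypothetical class names, status values, and in-memory storage; it is not the MPS code.

```python
class MessageStore:
    """Hypothetical server-side store; every message is persisted before pushing."""

    def __init__(self):
        self.messages = {}  # msg_id -> {"device_id", "payload", "status"}

    def save(self, msg_id, device_id, payload):
        self.messages[msg_id] = {
            "device_id": device_id, "payload": payload, "status": "PENDING"}

    def ack(self, msg_id):
        if msg_id in self.messages:
            self.messages[msg_id]["status"] = "ACKED"

    def unacked_for(self, device_id):
        """Messages to (re)send when the device establishes a connection."""
        return [(mid, m["payload"]) for mid, m in self.messages.items()
                if m["device_id"] == device_id and m["status"] != "ACKED"]


class Client:
    """Hypothetical client that shows each message at most once."""

    def __init__(self):
        self.seen = set()
        self.shown = []

    def receive(self, msg_id, payload):
        if msg_id not in self.seen:   # logical deduplication by message ID
            self.seen.add(msg_id)
            self.shown.append(payload)
        return msg_id                 # always acknowledge, even for duplicates
```

If the first acknowledgement never reaches the server, the message is still returned by `unacked_for` on the next connection, and the client acknowledges it again without showing it twice.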

3. Multi-dimensional broadcast message notification push

Running a client app often involves large-scale marketing campaigns. MPS message notifications are an effective way to remind users that a campaign has started and to prompt them to open the app. At the same time, the scope of network-wide notifications must be restricted by specific rules (such as operating system type, device model, and version), so MPS also supports multi-dimensional push.
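Multi-dimensional matching can be thought of as filtering connection records against a set of allowed values per dimension. The record fields and rule format below are assumptions made for this sketch.

```python
def matches(conn: dict, rules: dict) -> bool:
    """A connection matches when every rule dimension allows its value."""
    return all(conn.get(dim) in allowed for dim, allowed in rules.items())


def select_targets(connections, rules):
    """Return the device IDs of all connections that satisfy the rules."""
    return [c["device_id"] for c in connections if matches(c, rules)]


# Hypothetical usage: only Android clients on two specific app versions.
connections = [
    {"device_id": "d1", "os_type": "android", "app_version": "10.2"},
    {"device_id": "d2", "os_type": "ios", "app_version": "10.2"},
    {"device_id": "d3", "os_type": "android", "app_version": "9.0"},
]
rules = {"os_type": {"android"}, "app_version": {"10.1", "10.2"}}
targets = select_targets(connections, rules)
```

An empty rule set matches every connection, which corresponds to an unrestricted network-wide broadcast.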

Broadcast push works in two modes: self-built channels and third-party channels.

  • Self-built channel mode. When the front end triggers a broadcast task, MPS packages the push content and the rules required by the business, and the access gateway traverses the connection list in memory, matches the specified dimensions, and pushes to the matching clients (the left-most flow in the figure).

  • Third-party channel mode. MPS iterates over the binding relationships to obtain the third-party channel IDs to push to. For an app with hundreds of millions of users, traversing all users quickly is the strongest support for a campaign. MPS therefore relies on distributed scheduled tasks so that every server in the cluster takes part in the push. The distributed tasks currently use a 3+1 approach:

Step 1:

Under normal operation, the scheduling center checks at a fixed frequency (crontab expressions are supported) whether there are broadcast tasks to process. To avoid repeated triggering and redundant pushes, detection runs as a single task on a single machine (for example, in step 1 the MPS-B server is chosen at random).

Step 2:

When the MPS-B server detects a broadcast task to process, it splits the task into fragments (number of fragments = total number of users / number of machines in the cluster / number of users each task is scheduled to handle), assigns a task ID to each fragment, and returns the task model to the distributed scheduling center.

Step 3:

After receiving the list of fragmented tasks, the distributed scheduling center distributes them evenly across the machines in the cluster (MPS-A ... MPS-N), sending each task ID together with the task model. Each server then processes the tasks it receives according to the attributes in the task model.

Step 4:

Finally, each server calls the third-party channel with the message notification, and the rest of the work is handed over to the message center and client provided by the manufacturer.
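The splitting and distribution steps above can be sketched as follows. This is only an illustration under stated assumptions: the fragment count here is the total number of users divided by an assumed per-task batch size, rounded up, and tasks are handed out round-robin. The formula used by MPS also involves the number of machines and may differ; the function names and task fields are hypothetical.

```python
import math


def split_into_tasks(total_users: int, users_per_task: int) -> list:
    """Split the user range [0, total_users) into consecutive fragments."""
    count = math.ceil(total_users / users_per_task)
    return [
        {"task_id": i,
         "start": i * users_per_task,
         "end": min((i + 1) * users_per_task, total_users)}
        for i in range(count)
    ]


def assign_round_robin(tasks: list, machines: list) -> dict:
    """Distribute tasks evenly: task i goes to machine i mod len(machines)."""
    plan = {machine: [] for machine in machines}
    for i, task in enumerate(tasks):
        plan[machines[i % len(machines)]].append(task)
    return plan
```

With round-robin assignment, the number of tasks per machine differs by at most one, so every server in the cluster carries a similar share of the broadcast.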

This covers the main processing flow of MPS message push. Once the server and the client are implemented along these lines, the system takes shape. Applying the strategy pattern to push routing and the factory pattern to the management of third-party channel pushes can further improve the design. The system also needs to support the Logical Data Center (LDC) architecture and deployment across multiple data centers and availability zones to meet stability requirements such as disaster recovery. In such deployments, the internal call relationships and the control logic of distributed scheduled tasks need to be adjusted accordingly.
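To make the strategy and factory patterns mentioned above concrete, here is a minimal sketch. All class names are hypothetical, and the `push` methods return placeholder strings instead of calling any real vendor API.

```python
from abc import ABC, abstractmethod


class ChannelPusher(ABC):
    """Strategy interface: one implementation per push channel."""

    @abstractmethod
    def push(self, token: str, payload: str) -> str:
        ...


class SelfBuiltPusher(ChannelPusher):
    def push(self, token, payload):
        return f"self_built -> {token}: {payload}"  # placeholder, no network call


class FcmPusher(ChannelPusher):
    def push(self, token, payload):
        return f"fcm -> {token}: {payload}"  # placeholder, no real FCM call


class PusherFactory:
    """Factory that maps a channel name to its pusher class."""

    def __init__(self):
        self._registry = {}

    def register(self, channel: str, pusher_cls) -> None:
        self._registry[channel] = pusher_cls

    def create(self, channel: str) -> ChannelPusher:
        if channel not in self._registry:
            raise ValueError(f"unsupported channel: {channel}")
        return self._registry[channel]()
```

Adding a new vendor channel then means writing one more `ChannelPusher` subclass and registering it, without touching the routing code that calls `create`.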

We hope this section has given you an overview of the infrastructure behind MPS. If you have any questions, please leave a message in our WeChat official account or visit the message push documentation page (t.cn/EtnB6Gu) to learn more.

Previous articles

Opening: Overview of the Ant Financial mPaaS Server Core Component System

Ant Financial mPaaS Server Core Component System: Mobile API Gateway MGS

Ant Financial mPaaS Server Core Components: Analysis of the Mobile End-to-End Network Access Architecture under Hundred-Million-Level Concurrency

Alipay App Build Optimization: Optimizing Android Startup Performance through Package Rearrangement

Alipay App Build Optimization: Extreme Compression of Android Package Size
