This article describes how the vivo application store recommendation system efficiently supports personalized recommendation requirements.

1. Foreword

The store's application data mainly comes from channels such as operations scheduling, CPD, games, and algorithms. These data sources did not change after the recommendation project was established; what changed is that the recommendation system is now responsible for connecting to the data sources, while the store server only needs to connect to the application recommendation system.

If readers think we simply copied the store server code into the recommendation system, that would be far too naive: copying a system wholesale, without optimization or upgrades, was never an option. Below I will describe how we designed and planned the application recommendation system.

2. Challenges

In the author's opinion, the store application recommendation system needs not only high performance, high availability, and monitoring of core metrics, but also one core capability: efficiently supporting store traffic scenarios' access to personalized recommendations.

How do we define "efficient support"?

  • It can support at least three or four requirements being developed in parallel.
  • The development cycle of a single requirement should not exceed 2 days.
  • Keep bugs low: no more than 2 per scenario on average.
  • Ordinary requirements from product colleagues can be supported quickly.

Let's look at one of our application recommendation planning cases:

In scenario XX:

If main application A belongs to the application category:

  • First, fetch queue Q1 from data source X1.
  • Then, fetch queue Q2 from data source X2.
  • Next, intersect Q1 with Q2; on the intersection result, apply same-developer filtering and first-level category filtering.
  • If the intersection is empty, fall back to Q2; then take the elements at positions n1 and n2 of the resulting queue as the returned queue.
  • If no data has been obtained so far, select the n categories with the highest click-through rate from big data table XXX, according to the click probabilities of applications under the main application, and apply same-developer filtering to that queue.

If main application A belongs to the game category:

  • The XXXX
  • Apply second-level category filtering.
  • If there is still not enough data, take the data from x(n) and process it.
  • If fewer than 3 apps remain, take apps of the same category from the weekly list and rank them by downloads.
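To make the branching concrete, here is a minimal Java sketch of the "application category" branch. All method names and sample data are invented for illustration; they are not the real vivo interfaces, and the filtering steps are only marked by comments.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical sketch of the "application category" branch described above.
public class AppCategoryScenario {

    // Steps 1 and 2: fetch Q1 and Q2 (stubbed with sample data here).
    static List<String> fetchFromX1(String mainApp) {
        return List.of("appA", "appB", "appC");
    }

    static List<String> fetchFromX2(String mainApp) {
        return List.of("appB", "appC", "appD");
    }

    // Step 3: intersection, preserving Q1's order.
    static List<String> intersect(List<String> q1, List<String> q2) {
        LinkedHashSet<String> set = new LinkedHashSet<>(q1);
        set.retainAll(q2);
        return new ArrayList<>(set);
    }

    static List<String> recommend(String mainApp) {
        List<String> q1 = fetchFromX1(mainApp);
        List<String> q2 = fetchFromX2(mainApp);
        List<String> merged = intersect(q1, q2);
        // (same-developer and first-level category filtering would run here)
        if (merged.isEmpty()) {
            merged = new ArrayList<>(q2);  // step 4: fall back to Q2
        }
        return merged;
    }

    public static void main(String[] args) {
        System.out.println(recommend("mainApp"));  // [appB, appC]
    }
}
```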

That's right, readers, don't doubt yourselves: to avoid confusing you too much, we deliberately picked a simple requirement. Implementing one such feature is not a big deal, but when there are dozens of personalized recommendation requirements, would it not cause panic? Now take a brief look at some of our personalized recommendation requirements, as shown in Figure (1):

Figure (1)

With the case-by-case development approach the store server used before, there was no way to achieve the efficient scenario access described above. So it was up to us to optimize the process.

3. How to solve it

To explain the solution clearly, we will start from our actual thinking process and walk through the problem solving step by step.

3.1 Business process abstraction

From a purely planning point of view, for each scenario we need to do at least the things shown in Figure (2):

Figure (2)

  • **Get recommendation list:** call each data source to obtain its recommendation queue (note that different scenarios may use different interfaces, and the fields and structures returned by those interfaces may differ).
  • **Queue fusion:** intersect or merge the queues obtained in step 1.
  • **Data filtering (in-queue/inter-queue):** filter the items in the queues, mainly to improve relevance.
  • **Data fallback:** when queue data is insufficient, fall back to other lists, such as same-level category data from the weekly list.
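The four steps above can be sketched as one pipeline. The following is a minimal Java illustration; the names, the fusion rule (ordered union), and the fallback rule are simplified assumptions, not the production logic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Minimal sketch of the four-step flow in Figure (2); all names are illustrative.
public class RecommendPipeline {

    // Step 1: get recommendation lists from each data source (stubbed).
    static List<List<String>> getRecommendLists() {
        return List.of(List.of("a", "b", "c"), List.of("b", "c", "d"));
    }

    // Step 2: queue fusion, here a simple ordered union without duplicates.
    static List<String> fuse(List<List<String>> queues) {
        List<String> out = new ArrayList<>();
        for (List<String> q : queues)
            for (String item : q)
                if (!out.contains(item)) out.add(item);
        return out;
    }

    // Step 3: data filtering to improve relevance.
    static List<String> filter(List<String> queue, Predicate<String> keep) {
        List<String> out = new ArrayList<>();
        for (String item : queue) if (keep.test(item)) out.add(item);
        return out;
    }

    // Step 4: fallback from a backup list (e.g. the weekly list) when too short.
    static List<String> fallback(List<String> queue, List<String> weekly, int min) {
        List<String> out = new ArrayList<>(queue);
        for (String item : weekly) {
            if (out.size() >= min) break;
            if (!out.contains(item)) out.add(item);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> fused = fuse(getRecommendLists());
        List<String> filtered = filter(fused, s -> !s.equals("a"));
        System.out.println(fallback(filtered, List.of("x", "y"), 5));  // [b, c, d, x, y]
    }
}
```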

For convenience of development, the author further adjusted the model; the adjusted model is shown in Figure (3).

Figure (3)

After the queues are obtained, inter-queue filtering and in-queue filtering (such as main-application and developer filtering) can be combined, for the following reasons:

  • It is convenient to define filtering policies for each data source. In actual situations, different queues will use different filtering policies.
  • This approach fits the template method pattern well and keeps our process for obtaining the recommendation list consistent and stable.

3.2 Abstract process extension

Looking at Figure (3), readers will notice that we still have not addressed the differentiated processing required by the various recommendation scenarios mentioned earlier.

In fact, after implementing several requirements, we found it was almost impossible to absorb such large differences in one set of code; even if we managed it, the code would become extremely complex. Instead, we chose to face the differences directly: let them be implemented in scenario plugins, and put our effort into the trunk process.

To give scenarios flexible extension capabilities, the author added four links on top of Figure (3):

  • **Intra-thread sharing of queue results:** implemented with ThreadLocal. Storing each recommendation queue's results mainly makes it convenient to reuse a given queue for filling later; it also avoids repeatedly requesting the third-party data interfaces, reducing duplicate calls.
  • **Plugin queue fallback:** mainly uses a specified queue to fill up the result when the quantity is insufficient after filtering. A scenario plugin can also supply its own filling logic to complete the queue content.
  • **Plugin interface callback:** performs personalized processing on the preceding queue, such as queue intervention. The main reason the plugin interface callback and the plugin queue fallback are not merged is that the plugin queue fallback can be made configurable.
  • **Weekly list fallback:** provides a universal weekly-list query capability supporting queries along various dimensions; this data serves as the queue's last-resort fallback.
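The first link, intra-thread sharing of queue results via ThreadLocal, might look roughly like the sketch below. The class and method names are invented for illustration and are not the actual framework code.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch: per-thread cache of queue results, keyed by data source.
public class QueueResultHolder {

    // One map per request thread: data-source name -> queue fetched from it.
    private static final ThreadLocal<Map<String, List<String>>> RESULTS =
            ThreadLocal.withInitial(HashMap::new);

    static void put(String source, List<String> queue) {
        RESULTS.get().put(source, queue);
    }

    // Later stages (e.g. plugin queue fallback) reuse the cached queue instead
    // of re-invoking the third-party RPC interface.
    static List<String> get(String source) {
        return RESULTS.get().get(source);
    }

    // Should be called at the end of the request to avoid leaks in pooled threads.
    static void clear() {
        RESULTS.remove();
    }

    public static void main(String[] args) {
        put("cpd", List.of("app1", "app2"));
        System.out.println(get("cpd"));  // [app1, app2]
        clear();
    }
}
```

Clearing the ThreadLocal at the end of each request matters in practice, since server threads are usually pooled and reused across requests.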

The expanded flow chart is shown in Figure (4).

Figure (4)

3.3 Overall logical block diagram

Through the above analysis, we can implement personalized scenario content as much as possible in the plugin layer, with the framework layer responsible for loading each scenario plugin's specific recommendation logic.

From top to bottom, the system is divided into the plugin layer, framework layer, protocol adaptation layer, data source service layer, atomic service layer, and basic service layer. Upper layers depend on lower-layer services (interfaces) through SDKs. The responsibilities of each layer are as follows:

  • **Plugin layer:** the plugins corresponding to each scenario. The framework layer provides default implementations of the plugin callback and extension interfaces; the plugin layer implements the specific logic as needed.
  • **Framework layer:** defines the core flow and execution logic for recommendation data, and calls back into the extension and callback interfaces implemented by the plugin layer.
  • **Protocol adaptation layer:** responsible for locating the data source service corresponding to a scenario, encapsulating protocol conversion, and performing data conversion.
  • **Data source service layer:** the RPC encapsulation layer for the services provided by the various queue providers.
  • **Atomic service layer:** filtering-related services, mainly RPC services that depend on the store; they can be combined using the composite design pattern.
  • **Basic service layer:** supports relevance judgment and filtering by developer, first-level category, second-level category, application type, and so on. Like the atomic service layer, services in this layer are atomic-grained and support composite control.

At this point, as you can see, our development effort ultimately focuses on writing scenario plugins for personalized recommendation, rather than developing every business process from scratch.

Figure: application recommendation architecture

3.4 Key Implementation

After completing the overall logical block diagram design in step three, we researched the scheme in terms of scenario parameter definition, service design principles, use of design patterns, scenario hot-plugging, and other aspects, and finally implemented it.

3.4.1 Defining Scenario Service Parameters

To make recommendation scenarios sufficiently universal, we map the capabilities of the data source layer, atomic service layer, and basic service layer into service configuration: corresponding configuration items are defined to express the mapping and combination of services, while the plugin layer implements whatever differs. The configuration items are illustrated as follows:

  • **sourceMap:** the scenario's services are defined as a map to support multiple modules or experiment groups within one scenario; the key is the module ID, which the store server must carry when requesting recommendations.
  • **cpdRequest, algorithmRequest, gameRequest:** define the request parameters for the corresponding RPC calls.
  • **filterRequest:** defines the in-queue filtering requests, such as main-application and developer filtering.
  • **unionStrategy:** defines the merging rules within and between queues.
  • **sourceList:** the data sources used; if two data sources are defined in the figure above, this scenario must fetch data from both of them for queue merging and post-processing.
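Purely as an illustration (the article does not show the real schema), a scenario configuration combining these items might look something like this; every field value below is invented:

```json
{
  "sourceMap": {
    "module_1001": {
      "sourceList": ["cpd", "algorithm"],
      "cpdRequest":       { "note": "RPC parameters for the CPD queue" },
      "algorithmRequest": { "note": "RPC parameters for the algorithm queue" },
      "filterRequest":    { "filters": ["SAME_DEVELOPER", "FIRST_LEVEL_CATEGORY"] },
      "unionStrategy":    { "type": "INTERSECT", "fallbackSource": "algorithm" }
    }
  }
}
```

The key of `sourceMap` is the module ID that the store server carries in its request; the framework resolves it to this block and wires up the corresponding services.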

3.4.2 Atomization and uniqueness of services

Service atomization and service uniqueness are very important to the system; during implementation we strictly followed the two points below:

The third-party RPC services that application recommendation depends on, together with some of the internal filtering logic, are packaged into an SDK of fine-grained atomic services (methods). The SDK does not contain the business capabilities of specific personalized recommendation scenarios but focuses on basic function items; business content must be implemented in the scenario plugins, and services of the same type are combined wherever possible.

Service uniqueness is critical for achieving system convergence and controlling code size, and we continuously work toward it. Each service layer exposes its functions externally as an SDK, within which a unique service invocation entry point is implemented.

3.4.3 Rational use of design patterns

Many design patterns are used in the system to optimize the overall architecture. The following focuses on the template method pattern, the strategy pattern, and the composite pattern:

The template method pattern and the strategy pattern are used together in the process of obtaining the raw recommendation queues.

The benefits of the template method pattern are obvious: it makes this part of the process easy to standardize.

Different data sources require different data source services and methods. The advantage of the strategy pattern is that it makes it easy to define calls to different interfaces in different scenarios.
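A minimal sketch of how these two patterns can combine, with invented names: the abstract class fixes the fetch-then-filter flow (template method), and each data source supplies its own fetch implementation (strategy).

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: template method fixes the flow, subclasses supply the
// data-source-specific RPC call. Names and data are invented.
abstract class RecommendSource {

    // Template method: the flow is fixed and cannot be overridden.
    final List<String> getQueue(String scene) {
        List<String> raw = fetch(scene);   // varies per data source
        return filter(raw);                // shared filtering step
    }

    // Strategy hook: each data source implements its own call.
    protected abstract List<String> fetch(String scene);

    // Shared step: here, simply drop empty entries.
    protected List<String> filter(List<String> raw) {
        List<String> out = new ArrayList<>();
        for (String item : raw) if (!item.isEmpty()) out.add(item);
        return out;
    }
}

class CpdSource extends RecommendSource {
    @Override protected List<String> fetch(String scene) {
        return List.of("cpdApp1", "", "cpdApp2");  // stubbed RPC result
    }
}

public class TemplateDemo {
    public static void main(String[] args) {
        System.out.println(new CpdSource().getQueue("sceneXX"));  // [cpdApp1, cpdApp2]
    }
}
```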

Wherever possible, atomic services or methods of the same type support the composite pattern, which greatly facilitates subsequent extension.

To illustrate the actual implementation: when we define filter types, multiple filter types can be passed in, and the upper-level business passes them in as needed. Using the composite design pattern makes a huge difference to extensibility.
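A hypothetical sketch of such a composite filter in Java; the filter names are invented, and in the real system the individual filters (developer, category, and so on) would call store RPC services:

```java
import java.util.ArrayList;
import java.util.List;

// Leaf filters and the composite share one interface, so the upper layer
// can pass in any combination of filter types.
interface AppFilter {
    List<String> apply(List<String> queue);
}

class BlacklistFilter implements AppFilter {
    @Override public List<String> apply(List<String> queue) {
        List<String> out = new ArrayList<>();
        for (String app : queue) if (!app.equals("banned")) out.add(app);
        return out;
    }
}

class DedupFilter implements AppFilter {
    @Override public List<String> apply(List<String> queue) {
        List<String> out = new ArrayList<>();
        for (String app : queue) if (!out.contains(app)) out.add(app);
        return out;
    }
}

// Composite: runs any number of child filters in order.
class CompositeFilter implements AppFilter {
    private final List<AppFilter> children;
    CompositeFilter(List<AppFilter> children) { this.children = children; }
    @Override public List<String> apply(List<String> queue) {
        List<String> out = queue;
        for (AppFilter f : children) out = f.apply(out);
        return out;
    }
}

public class FilterDemo {
    public static void main(String[] args) {
        AppFilter filter = new CompositeFilter(
                List.of(new BlacklistFilter(), new DedupFilter()));
        System.out.println(filter.apply(List.of("a", "banned", "a", "b")));  // [a, b]
    }
}
```

Adding a new filter type then means adding one leaf class; no caller needs to change.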

3.4.4 Hot Swap scenarios

To keep scenarios isolated and non-interfering, the author uses Java SPI: the scenario interface is defined in the framework layer, and each scenario implements the interface in its own separate JAR. This approach helps minimize the plugins' intrusion into the framework layer and the underlying service layers.
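A minimal sketch of the SPI wiring, with invented names; each scenario JAR would additionally ship a `META-INF/services` provider-configuration file naming its implementation class:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.ServiceLoader;

// Framework-layer scenario interface (names are illustrative).
interface ScenePlugin {
    String sceneId();                                        // which scenario this plugin serves
    java.util.List<String> postProcess(java.util.List<String> queue);
}

public class PluginRegistry {

    // Framework side: discover every plugin on the classpath via SPI
    // and index it by scene ID.
    static Map<String, ScenePlugin> loadAll() {
        Map<String, ScenePlugin> plugins = new HashMap<>();
        for (ScenePlugin p : ServiceLoader.load(ScenePlugin.class)) {
            plugins.put(p.sceneId(), p);
        }
        return plugins;
    }

    public static void main(String[] args) {
        // With no scenario JARs on the classpath, no plugins are discovered.
        System.out.println(loadAll().size());
    }
}
```

Because discovery is classpath-driven, adding or removing a scenario is just adding or removing its JAR; the framework code does not change.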

4. Changes brought about

In the past, the store server wrote the complete recommendation-queue acquisition, fusion, assembly, and filtering logic in the service layer of each interface, with a large amount of duplicated content. As versions iterated, many different generations of processing logic became mixed together, making changes and upgrades difficult: touching one part could affect the whole. The application recommendation system has brought major improvements in two directions:

  1. The processing framework logic is fully abstracted and independent. Each business scenario only needs to write a small amount of plugin callback logic as needed (scenarios with nothing very special need no plugin callback extension at all and can be mapped to a scenario purely through rule configuration, achieving development-free access; currently about 30% of scenarios are development-free).
  2. Scenarios are isolated and independent of one another. Complex feature upgrades can be rolled out incrementally against the corresponding scenario ID or module ID without affecting existing logic.

Final thoughts

Through the implementation of the solutions above, we reduced the development workload of each recommendation scenario by about 75%, and the bug rate also dropped significantly.

Author: Vivo -Huang Xiaoqun