Guest | Peng Changping

Peng Changping graduated from the Institute of Automation, Chinese Academy of Sciences. He has more than ten years of research and industry experience in machine learning and recommendation systems. He has published a number of papers at international conferences relevant to recommendation systems, such as RecSys and CIKM, and currently heads recommendation advertising algorithms at Jingdong (JD).

Since the rise of the Internet, recommendation systems have become ubiquitous and are now the revenue engine of many e-commerce platforms. Jingdong's personalized recommendation system has likewise brought the company substantial benefits. As recommendation systems play an increasingly important role in information distribution, we explore how large-scale machine learning, deep learning and related technologies are applied in Jingdong's product search and recommendation, and what conditions an efficient and valuable recommendation system should meet.

How do recommendation systems drive business growth?

In the digital information age, recommendation systems have become standard technology for consumer-facing (to-C) Internet products, and recommendation algorithms play a crucial role in improving business results. Platforms such as Amazon and Netflix derive huge commercial value from their recommendation systems. According to statistics, the recommendation system generates more than $1 billion of commercial value for Netflix every year, and about 40% of Amazon's revenue comes from its personalized recommendation system.

For e-commerce, a personalized recommendation system can meet the varied needs of a massive user base by showing each user a different selection. In essence, when a user's purchase intention is unclear, the system uses machine learning or deep learning algorithms, combined with user features, product features and scenario features, to build a model of the user's interests. It then finds the products the user is interested in among a vast catalog, shortening the distance between users and products and improving both purchase efficiency and the product experience. Peng Changping believes that personalized recommendation is an effective distribution mechanism when candidates are abundant. He explained how Jingdong's recommendation system drives business growth from two perspectives: the quantity of products and the quality of products.

First, in terms of quantity, the number of SKUs on an e-commerce platform is orders of magnitude beyond what the human brain can process; even a single product such as jam has a large set of similar SKUs on Jingdong. Scholars at Stanford University ran an experiment in an offline supermarket. Group A was offered 24 flavors of jam, and only 3% of the shoppers who stopped at the shelf made a purchase. Group B was offered 6 flavors, and 30% of those who stopped at the shelf made a purchase, 10 times the rate of group A. "Less is more": in an e-commerce setting with too many candidates, personalized recommendation shops on the user's behalf and narrows the field to a small number of suitable choices.

Second, in terms of quality, personalized recommendation is based on the values of the platform. Jingdong's recommendation system mainly promotes products that are "good", "economical" and "fast" by integrating the brand, attribute, price, review and logistics information of each product. Users get a better shopping experience and their engagement increases, which forms a virtuous circle and brings better revenue.

As large-scale machine learning, deep learning and related technologies mature, they are used more and more widely in product recommendation. Peng Changping believes that recommendation is currently the area of industry where machine learning algorithms are applied most widely, most deeply and most successfully. In almost every stage, data and algorithm-driven models are replacing decisions made by gut feeling.

The best-known use of deep learning in recommendation systems is probably click-through rate (CTR) and conversion rate estimation, but he offers several other examples:

1. Recall: It is difficult to solve every problem with one model, so Jingdong uses vector-based, tree-based and graph-based deep learning models side by side in recall.

2. Product knowledge graph: Understanding the text, images and video of products, and the relationships between products, relies almost entirely on NLP, CV and other machine learning algorithms.

3. Rerank: Recommendation is a multi-objective optimization problem, so a rerank step on top of the estimated click-through rate is needed to improve user experience and browsing depth.

4. Session global optimization: Guiding users to keep scrolling down is a business scenario that is a good match for deep reinforcement learning.
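As a rough illustration of using several recall models side by side, the sketch below merges the ranked candidate lists of three recall channels by round-robin and removes duplicates. The channel names, item ids, quota and the merge_recall helper are all hypothetical; the source does not describe how Jingdong actually merges its recall channels.

```python
def merge_recall(channels, quota_per_channel, limit):
    """Round-robin merge of ranked candidate lists from several recall channels.

    channels: dict mapping a channel name to its ranked list of item ids.
    Returns up to `limit` (item_id, [channels that recalled it]) pairs.
    """
    # Cap each channel so that no single model dominates the candidate pool.
    trimmed = {name: items[:quota_per_channel] for name, items in channels.items()}
    depth = max((len(items) for items in trimmed.values()), default=0)

    merged = {}  # item_id -> channels that recalled it, in first-seen order
    for rank in range(depth):
        for name, items in trimmed.items():
            if rank < len(items):
                merged.setdefault(items[rank], []).append(name)
    return list(merged.items())[:limit]


# Hypothetical output of three recall models for one user.
channels = {
    "vector": ["a", "b", "c"],
    "tree": ["b", "d"],
    "graph": ["e", "a"],
}
candidates = merge_recall(channels, quota_per_channel=2, limit=4)
```

Recording which channels recalled each item keeps the merged pool traceable, so later ranking stages can use the channel as a feature or report per-channel coverage.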

What are the characteristics of a good recommendation system?

Because of differences in user groups, business scenarios, regions and cultures, recommendation systems take many forms, and the systems of different platforms differ in countless details. Peng Changping said that, compared with media content platforms such as video, news and live streaming, JD's e-commerce recommendation system is easier to bring to 60 points, but hard to bring to 80 or 90 points.

From a framework perspective, all recommendation systems perform user understanding, item understanding and matching, and all of them have stages such as candidate selection, recall, click-through rate estimation and rerank. The difficulty of e-commerce recommendation, however, lies in the following three aspects:

First, on the user side: on content and information platforms, users' needs stay relatively stable over long periods, and the whole consumption process is completed online. In e-commerce, the online part is only the transaction, and the offline part of the process is difficult to track and digitize. The e-commerce scenario therefore poses a great challenge for identifying and stimulating user needs.

Second, on the item side: content producers on content and information platforms can publish new content day after day, in varied forms, around the same interest theme. In the shopping scenario, once a user has made a purchase, similar products can no longer be recommended, so the demand for expanding and stimulating user needs is higher.

Third, in terms of the action the recommendation system expects users to take: content and information platforms mainly meet users' entertainment needs, and the cost to a user of consuming a poor recommendation is very low. In the shopping scenario, the recommendation system expects more than clicking and browsing; its optimization goal is to spark the user's desire for a product ("planting grass") and ultimately lead to a purchase. If item quality is poor or recommendations are not accurate enough, users will abandon the platform's recommendation feature, or even leave the platform.

So what characterizes an efficient and valuable recommendation system? Peng Changping believes that a good recommendation system delivers the items users like even when they have not actively expressed their needs. Such a system must meet the following three conditions:

First, it meets users' needs, which shows in users being willing to look and staying for a long time;

Second, it supports growth, which shows in expanding users' interests, driving the growth of high-quality product or content providers, and being friendly to new users and new merchants;

Third, it reflects the values of the platform. The recommendation system promotes survival of the fittest among the participants on the platform.

To achieve these three points, the recommendation system needs to do the following:

1. Learn from user behavior feedback and item information, so that the model matches adaptively based on data.

2. There is no silver bullet in recall, so different types of algorithms must be used for recall. Models at every stage should generalize well and be specifically optimized for cold-start users and items.

3. Most objective functions that reflect the values of the platform are multi-objective.
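One simple way to turn several objectives into a single ranking score is a weighted sum (linear scalarization). The sketch below is only an illustration under assumed inputs: the objective names, predicted values and weights are invented, and the source does not say which multi-objective method Jingdong uses.

```python
def blended_score(predictions, weights):
    """Weighted sum of per-objective predictions (linear scalarization)."""
    return sum(weight * predictions[name] for name, weight in weights.items())


def rank_items(items, weights):
    """Sort items from highest to lowest blended score."""
    return sorted(
        items,
        key=lambda item: blended_score(item["pred"], weights),
        reverse=True,
    )


# Hypothetical model outputs for two candidate items.
items = [
    {"id": "X", "pred": {"click": 0.10, "purchase": 0.02, "novelty": 0.1}},
    {"id": "Y", "pred": {"click": 0.08, "purchase": 0.05, "novelty": 0.5}},
]

click_only = {"click": 1.0}
blended = {"click": 1.0, "purchase": 2.0, "novelty": 0.2}

by_click = [item["id"] for item in rank_items(items, click_only)]
by_blend = [item["id"] for item in rank_items(items, blended)]
```

With the click-only weights item X ranks first, while the blended weights put item Y first, which shows how the choice of weights encodes what the platform values.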

Application and practice of the e-commerce recommendation system

A recommendation system is an information filtering system that predicts a user's "rating" or "preference" for an item, with the goal of generating meaningful recommendations of items or content the user is interested in. On an Internet full of massive information and data, obtaining valuable content without a recommendation system would be like looking for a needle in a haystack. A recommendation system can sift through large volumes of dynamically generated information, provide personalized content and services to users, and effectively address information overload. With the explosion of digital information and Internet users, recommendation systems are more important than ever.

The development of JD's recommendation system has gone through the following four stages:

First, the stage of meeting user needs. The earliest system was adapted from the search system and treated recently browsed products as the user's needs. Item-based CF (collaborative filtering) was the most important recall method.
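Item-based CF can be sketched in a few lines: items are similar when the same users interact with both, and recall returns the items most similar to what the user browsed recently. The example below uses cosine similarity over co-occurrence counts; the browsing histories and function names are made up for illustration and are not taken from Jingdong's system.

```python
import math
from collections import defaultdict
from itertools import combinations


def item_similarities(user_items):
    """Cosine similarity between items based on how many users touched both."""
    item_count = defaultdict(int)
    co_count = defaultdict(lambda: defaultdict(int))
    for items in user_items.values():
        unique = set(items)
        for item in unique:
            item_count[item] += 1
        for a, b in combinations(sorted(unique), 2):
            co_count[a][b] += 1
            co_count[b][a] += 1
    return {
        a: {b: n / math.sqrt(item_count[a] * item_count[b]) for b, n in row.items()}
        for a, row in co_count.items()
    }


def recall(recent_items, sim, k):
    """Return up to k unseen items most similar to the recently browsed ones."""
    seen = set(recent_items)
    scores = defaultdict(float)
    for item in recent_items:
        for other, s in sim.get(item, {}).items():
            if other not in seen:
                scores[other] += s
    # Sort by score, breaking ties by item id so the output is deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:k]]


# Hypothetical browsing histories.
history = {
    "u1": ["phone", "case"],
    "u2": ["phone", "case", "charger"],
    "u3": ["phone", "charger"],
    "u4": ["jam", "bread"],
}
sim = item_similarities(history)
```

Here recall(["case"], sim, 1) returns ["phone"], because every user who browsed the case also browsed the phone.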

Second, the stage of expanding user needs. At this stage, both data and algorithms were used to enrich recall from as many angles as possible. Jingdong therefore set up a project called "Recall Kaleidoscope" to continuously improve the diversity and coverage of recall. In ranking, the optimization objective shifted from emphasizing click-through rate and conversion rate to optimizing users' scroll depth, novelty and diversity.

Third, the stage of session global optimization and merchant ecosystem optimization. In this stage, JD's optimization focuses on rerank, which treats the user's preceding browsing behavior within the session as one complete list. Rerank is a process of list generation and list evaluation, that is, optimizing the user's page views and clicks over the list as a whole. The other direction is the introduction of an ecosystem optimization mechanism: the model quantifies the long-term value, to users and to merchants, of a single interaction between a user and a product, and feeds this estimated value into the ranking mechanism.
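The idea of list generation plus list evaluation can be illustrated with a toy example: generate candidate orderings, score each ordering as a whole, and keep the best one. Everything below is an assumption made for illustration: the hand-written evaluator (attention decays with position, and an item is discounted when it repeats the category of the previous item), the numbers, and the brute-force generator stand in for whatever learned models a production system would use.

```python
from itertools import permutations


def expected_list_clicks(items, position_decay=0.8, repeat_penalty=0.5):
    """Toy list evaluator: expected clicks over the list as a whole."""
    total = 0.0
    attention = 1.0
    previous_category = None
    for item in items:
        p_click = item["pctr"]
        if item["category"] == previous_category:
            p_click *= repeat_penalty  # back-to-back similar items are less appealing
        total += attention * p_click
        attention *= position_decay  # users pay less attention further down
        previous_category = item["category"]
    return total


def best_list(candidates, slots):
    """Toy list generator: try every ordering (only feasible for tiny lists)."""
    return list(max(permutations(candidates, slots), key=expected_list_clicks))


# Hypothetical candidates with pointwise click estimates.
candidates = [
    {"id": "A", "category": "phone", "pctr": 0.30},
    {"id": "B", "category": "phone", "pctr": 0.28},
    {"id": "C", "category": "jam", "pctr": 0.20},
]
pointwise_order = sorted(candidates, key=lambda c: c["pctr"], reverse=True)
reranked = best_list(candidates, slots=3)
```

Sorting by the pointwise estimate gives A, B, C, while the list-level evaluator prefers A, C, B: the ordering with the highest value for the whole list is not necessarily the one that sorts items by their individual scores.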

Fourth, the stage of joint optimization across user groups and business groups. As Jingdong's business has developed, its user base has expanded from a relatively homogeneous group to a diverse one, and users in third- to sixth-tier cities now account for more than sixty percent of users. Both the expansion of the user base within the Jingdong App and the rapid growth of new Apps built for lower-tier markets, such as Jingdong Speed Edition and Jingxi, pose a greater challenge for personalized recommendation algorithms. Technologies such as the product knowledge graph and transfer learning have played an important role in this stage.

Across these stages, Jingdong's recommendation system has also worked hard to improve the accuracy, surprise and coverage of its recommendations. Peng Changping said that improving several seemingly contradictory objectives at the same time requires work along three dimensions: diversifying recall algorithms, moving from user-item pair optimization to session-level global optimization, and ecosystem optimization that protects the growth of high-quality merchants. Jingdong has done the following along these three dimensions:

1. Recall Kaleidoscope: In terms of recall granularity, hierarchical representations of users and items at different granularities have been built and matched against each other. In terms of recall algorithms, Boolean matching models, embedding-based retrieval and knowledge-based retrieval each account for a large share of the recommendation results.

2. Session global optimization: For a single recommendation candidate, accuracy and surprise are in tension, but the two are unified when the goal is to maximize total clicks over the session; that is, the CTR model has changed from pointwise to listwise.

3. Merchant ecosystem optimization: Quality grading and a cold-start mechanism for new merchants and new products effectively guarantee exposure and order volume for high-quality merchants and products on the platform. The constant influx of new merchants and new products is an important driver of higher coverage and surprise.
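The difference between pointwise and listwise training can be seen in the loss functions. The sketch below contrasts a per-item log loss with a softmax cross-entropy computed over the whole list, which is one common listwise formulation; the source does not state which listwise loss Jingdong uses, and the function names are illustrative.

```python
import math


def pointwise_log_loss(scores, clicks):
    """Each item is judged on its own: mean binary cross-entropy."""
    total = 0.0
    for score, clicked in zip(scores, clicks):
        p = 1.0 / (1.0 + math.exp(-score))
        total -= clicked * math.log(p) + (1 - clicked) * math.log(1.0 - p)
    return total / len(scores)


def listwise_softmax_loss(scores, clicks):
    """Items compete within the list: softmax cross-entropy on clicked items.

    Assumes the list contains at least one click.
    """
    top = max(scores)  # subtract the max for numerical stability
    log_norm = top + math.log(sum(math.exp(s - top) for s in scores))
    n_clicks = sum(clicks)
    return -sum(c * (s - log_norm) for s, c in zip(scores, clicks)) / n_clicks


# Same relative ordering, different absolute scores.
clicks = [1, 0]
low = [0.0, 2.0]
shifted = [1.0, 3.0]
```

Shifting every score in the list by the same amount leaves the listwise loss unchanged but changes the pointwise loss, because the listwise loss only cares how items compare with the others shown in the same list.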

According to Peng Changping, the Jingdong platform has many sub-scenarios, each with many finer-grained search and recommendation surfaces. For joint optimization of recommendation across these sub-scenarios, the most important tool is transfer learning. User behavior data in each sub-scenario is sparse, yet each sub-scenario has its own distinctive user behavior patterns. Jingdong trains models jointly on data from the main scenario and multiple sub-scenarios, and has designed a multi-layer network structure that lets the model transfer knowledge not only from the main scenario but also from similar sub-scenarios. Through transfer learning, a single model for a sub-scenario can be built and applied simultaneously across multiple terminals such as the Jingdong App, the Jingxi App, the Jingdong Speed Edition App, WeChat Shopping and QQ Shopping.
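The benefit of training jointly on the main scenario and sub-scenarios can be shown with a deliberately simplified stand-in for the multi-layer network described above: a logistic model whose weights are the sum of a shared part, learned from all scenarios, and a small scenario-specific part. This is not Jingdong's architecture; the class, scenario names and data are hypothetical.

```python
import math


class SharedPlusSceneModel:
    """Logistic model with shared weights plus per-scenario offsets."""

    def __init__(self, dim, lr=0.1):
        self.dim = dim
        self.lr = lr
        self.shared = [0.0] * dim
        self.scene = {}  # scenario name -> offset weights

    def predict(self, scene, x):
        # A scenario without its own data falls back to the shared weights.
        offset = self.scene.get(scene, [0.0] * self.dim)
        z = sum((w + o) * xi for w, o, xi in zip(self.shared, offset, x))
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, scene, x, label):
        """One SGD step on the log loss for a single (scenario, x, label) event."""
        gradient = self.predict(scene, x) - label
        offset = self.scene.setdefault(scene, [0.0] * self.dim)
        for i, xi in enumerate(x):
            self.shared[i] -= self.lr * gradient * xi
            offset[i] -= self.lr * gradient * xi


model = SharedPlusSceneModel(dim=2)
# Plenty of main-scenario data: feature 0 predicts a click, feature 1 does not.
for _ in range(50):
    model.update("main", [1.0, 0.0], 1)
    model.update("main", [0.0, 1.0], 0)
# A few events from a sub-scenario where feature 0 behaves differently.
for _ in range(5):
    model.update("sub", [1.0, 0.0], 0)
```

A sub-scenario with no data of its own inherits the shared weights, so model.predict("new", [1.0, 0.0]) is above 0.5, while the handful of "sub" events already pull that scenario's prediction below the main scenario's.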

As competition among e-commerce platforms intensifies, attracting more new users while increasing the activity of existing users and the stickiness of the platform is a key factor in a platform's development. Continuous iteration and upgrading of the recommendation system is therefore particularly important. In the future, JD's recommendation system will be optimized in three technical directions: shopping-guide content recommendation, scenario-based recommendation and the ecosystem optimization mechanism.

In shopping-guide content recommendation, represented by e-commerce content such as livestream selling, the Jingdong platform has accumulated a large number of content producers. The high-quality content they produce, together with the products themselves, becomes the items of the recommendation system. Different types of material and different optimization objectives pose a greater challenge to the algorithms, while richer content gives users a better browsing and buying experience.

In scenario-based recommendation: when it comes to the experience of "shopping", many people are deeply impressed by the scenario-based layout of IKEA stores. Based on its understanding of users' consumption scenarios, Jingdong is assembling the full set of products a scenario requires and presenting it to users in a more three-dimensional way, to provide a scenario-based shopping experience online.

Finally, for the ecosystem optimization mechanism, the future work is to strengthen, within the recommendation system, both the survival-of-the-fittest mechanism for merchants and the growth mechanism for high-quality new merchants and new products.

Technical problems and breakthroughs

Although recommendation systems have largely alleviated information overload and satisfied users' personalized needs, some problems still hinder their development. Peng Changping believes that the biggest difficulty is "data", in two respects: first, how to obtain data comprehensively and process it rapidly; second, how models can learn more efficiently from large amounts of data.

To obtain data comprehensively and process it quickly, "comprehensive" and "fast" must each be addressed. "Comprehensive" means integrating omnichannel data, online and offline, from every touchpoint that interacts with users. "Fast" requires a near-real-time streaming data processing mechanism that shortens the delay before new data reaches the model and before model parameters are updated. With the diversification of IoT terminals and growing on-device computing power, combining on-device computing with cloud computing can further improve how quickly the recommendation system responds to user feedback.
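A small example of what near-real-time processing buys: a click-through feature that is updated on every event and lets older events fade, so the model always sees fresh statistics. The class below is a hypothetical sketch using exponential time decay; the half-life, timestamps and names are assumptions, and events are assumed to arrive in time order.

```python
class DecayedCtr:
    """Streaming click-through rate with exponential time decay."""

    def __init__(self, half_life_s):
        self.half_life_s = half_life_s
        self.clicks = 0.0
        self.views = 0.0
        self.last_ts = None

    def update(self, ts, clicked):
        """Fold one impression event into the running counts."""
        if self.last_ts is not None:
            # Counts lose half their weight every half_life_s seconds.
            decay = 0.5 ** ((ts - self.last_ts) / self.half_life_s)
            self.clicks *= decay
            self.views *= decay
        self.last_ts = ts
        self.views += 1.0
        if clicked:
            self.clicks += 1.0

    def ctr(self, prior_clicks=0.0, prior_views=0.0):
        """Current estimate, optionally smoothed with prior pseudo-counts."""
        denominator = self.views + prior_views
        return (self.clicks + prior_clicks) / denominator if denominator > 0 else 0.0


feature = DecayedCtr(half_life_s=60)
feature.update(ts=0, clicked=True)    # a click at t = 0 s
feature.update(ts=60, clicked=False)  # a non-click one half-life later
```

After these two events the older click counts for half as much as the newer impression, so feature.ctr() is 0.5 / 1.5, about 0.33, rather than the undecayed 1 / 2.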

Facing massive and complex data, it is necessary not only to raise the raw capability of the system, that is, its capacity to handle complex models and serve terabytes of data, but also to improve how well model structures adapt to massive data. On the latter, Peng Changping said he is optimistic about the maturing of AutoML technology. For example, the NAS (neural architecture search) work now under way already yields results on par with model structures tuned over a long time by professional algorithm engineers, and he believes that in the near future it can replace the "alchemists" who tune model structures by hand.

Peng Changping believes that:

In industrial recommendation systems, there is no single core technology. In a recommendation system, the algorithm takes the initiative and people are relatively passive, so both users and merchants have low tolerance for algorithm errors. Only when the system collects data that is as complete and timely as possible, adopts more efficient algorithms and polishes every detail can users and merchants trust it. As technology advances, every field, including clothing, food, housing, travel and entertainment, will enter a state of oversupply. It can be predicted that with the spread of 5G and IoT, people and electronic devices will rely more and more on recommendation technology. It may no longer even be a platform-level recommendation system; rather, everyone will need a personalized recommendation "assistant" in every field.

Live preview

If the above has left you wanting more and you would like to talk with Peng Changping directly, here is your chance!

Next Monday evening (September 7) at 20:00, Peng Changping will join the InfoQ online open class to share the application and practice of Jingdong's e-commerce recommendation system. If you are interested in how user interest is expanded in e-commerce scenarios, be sure to tune in!
