Author: Xianyu Technology — Ying Mo

Introduction

E-commerce systems rely on two important concepts in the commodity system, SPU (Standard Product Unit) and SKU (Stock Keeping Unit), which directly determine how a commodity is stored in the system. An SPU describes a generic commodity in the shortest, most standardized terms, and acts as a bridge when commodity information travels across domains and channels, so that commodity sales can be integrated globally, both online and offline. This article introduces the SPU system of Xianyu (Idle Fish).

SPU overview

An SPU carries the key information about a product in an e-commerce system and is the smallest unit at which commodity information is aggregated. Before introducing the SPU system, we briefly introduce the most important classification in the category system, the four types of category attributes: key attributes, sales attributes, commodity attributes, and bound attributes.

• Key attributes: Attributes that constrain and define a product and identify a unique product (SPU). For example, in the mobile phone category, “Brand: Apple; Model: iPhone5S” uniquely identifies a product.
• Sales attributes: Attributes that determine the buying and selling behavior. They can be understood simply as the options on the page that pops up before an order is placed. For example, for the iPhone5S, selecting white + 32G generates an order.
• Commodity attributes: Supplementary attributes that describe the commodity in more detail, for example whether a phone is under warranty, its color, and so on.
• Bound attributes: Additions and refinements to key attributes. For example, “Brand: Apple; Model: iPhone11” + purple + 256GB confirms a 6.1-inch screen size.

Our protagonist, the SPU model, is usually defined in Alibaba's system as: key attributes + bound attributes + common attributes.
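
As a rough illustration of how these attribute types relate, here is a minimal sketch in Python. The class names (`Attribute`, `Spu`, `Sku`), the field names, and the warranty value are assumptions made for this example, not Xianyu's actual schema.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Attribute:
    """One attribute name/value pair, e.g. Brand = Apple."""
    name: str
    value: str


@dataclass
class Spu:
    """Hypothetical SPU record: key + bound + common attributes."""
    key_attributes: List[Attribute]
    bound_attributes: List[Attribute] = field(default_factory=list)
    common_attributes: List[Attribute] = field(default_factory=list)

    def identity(self) -> Tuple[Tuple[str, str], ...]:
        """Key attributes alone identify the SPU; their order must not matter."""
        return tuple(sorted((a.name, a.value) for a in self.key_attributes))


@dataclass
class Sku:
    """An SPU plus the sales attributes chosen when placing an order."""
    spu: Spu
    sales_attributes: List[Attribute]


# Example from the text: Brand + Model identify the SPU,
# white + 32G are the sales attributes selected at order time.
iphone_5s = Spu(
    key_attributes=[Attribute("Brand", "Apple"), Attribute("Model", "iPhone5S")],
    common_attributes=[Attribute("Warranty", "Yes")],  # illustrative value
)
white_32g = Sku(iphone_5s, [Attribute("Color", "White"), Attribute("Storage", "32G")])
```

Sorting the key attributes before building the identity means two records that list the same brand and model in a different order still compare as the same SPU.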

The state of Alibaba's SPU system and Xianyu SPU

So, as an e-commerce giant, what is the current state of Alibaba's SPU system? It is quite mature and has brought great value to Alibaba's commodity system. Over the course of its evolution, many excellent models have been developed around SPUs for different platforms and business types. If Alibaba's SPU system is already so mature, why does Xianyu build its own system instead of using the existing Taobao SPU system? Through multi-party cooperation, the Taobao SPU system can ingest newly released commodities in real time, and it has grown into a huge data system. However, given the demands of Xianyu's business side, we face the following problems:

1. For historical reasons, such as the Taobao SPU library being built jointly by multiple parties, data from many sources is mixed together. Even after several rounds of cleaning, more than 90% of the data is still unusable.
2. Xianyu has its own businesses, such as card vouchers and rentals, and has built its own mature category management system, attribute management system, category prediction system, and others. We want products to fit our business better, so it is necessary to maintain our own SPU system.
3. Xianyu's business needs to attach service providers to SPUs, so keeping multiple copies of the same data (such as LV and Louis Vuitton) is neither scientific nor rigorous, and does not fit the subsequent verification and service-opening processes.
4. Xianyu SPU aims to define a product uniquely through a combination of attributes, so as to support multiple business scenarios in a more standardized and official manner.
5. Xianyu wants its operations staff to be involved in managing SPU data.

SPU data link construction

The requirements for building the Xianyu SPU system, and the problems it has to solve, mainly focus on the following points:

1. Xianyu SPU is connected with the structured system, and core categories are compatible with Taobao SPU.
2. For specific services, colleagues on the business side can customize SPU attributes.
3. The key attributes of a Xianyu SPU must be unique across the whole product table. Attribute values are named according to international standards, and aliases are supported.
4. Support horizontal expansion by the business side, including extended information such as inspection labels, search-and-release labels, and inspection items.
5. Provide a visual platform for operations intervention and open up operation and maintenance rights, subject to the standard approval process.

The structuring of Xianyu SPU relies mainly on the Tyler category operation and management platform. Maintaining the relationships on this platform guarantees a one-to-one mapping between Xianyu and Taobao. Meanwhile, the underlying storage of SPU data is consistent with Xianyu's structured data and uses the same attribute system.

The Xianyu SPU data system is built through two channels: standard SPU plus Xianyu's own SPU. Standard SPU data is data that is well defined, not easily ambiguous, recognized by the industry, and complete, such as mobile phones. For this kind of data, we use offline tasks to clean the Taobao SPU library several times; once the volume is down to an acceptable order of magnitude, records are selected manually. We reuse all available information at the Taobao SPU dimension, including bound attributes, sales attributes, commodity attributes, SPU attribute information, and images, and use offline data tasks to fill in the Xianyu-side information: Tyler platform feature labels, Xianyu channel category, Taobao category, business label, status, business data, and so on.

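
The uniqueness and alias requirements above can be illustrated with a small sketch. It assumes a hypothetical alias table and an in-memory stand-in for the SPU table; the names `BRAND_ALIASES`, `canonical_brand`, and `SpuTable`, and the model value "Model-A", are invented for this example. The LV / Louis Vuitton pair is the alias case mentioned earlier.

```python
from typing import Dict, Tuple

# Hypothetical alias table: every known spelling maps to one canonical value.
BRAND_ALIASES: Dict[str, str] = {
    "lv": "Louis Vuitton",
    "louis vuitton": "Louis Vuitton",
}


def canonical_brand(raw: str) -> str:
    """Return the canonical brand name, or the trimmed input if unknown."""
    cleaned = raw.strip()
    return BRAND_ALIASES.get(cleaned.lower(), cleaned)


class SpuTable:
    """In-memory stand-in for an SPU table with a unique key constraint."""

    def __init__(self) -> None:
        self._rows: Dict[Tuple[str, str], int] = {}
        self._next_id = 1

    def upsert(self, brand: str, model: str) -> int:
        """Insert an SPU, or return the existing id if the key already exists."""
        key = (canonical_brand(brand), model.strip())
        if key not in self._rows:
            self._rows[key] = self._next_id
            self._next_id += 1
        return self._rows[key]


table = SpuTable()
first = table.upsert("LV", "Model-A")
second = table.upsert("Louis Vuitton", "Model-A")
print(first == second)  # True: both spellings resolve to the same SPU
```

Resolving aliases before building the key is what keeps "LV" and "Louis Vuitton" from becoming two separate SPUs.
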
At present, there are three status levels of SPU information, as follows (note: the fields and values here are illustrative, not real values).

Xianyu's own SPU data is sorted out at the initiative of the various business parties, for example for trendy clothing, trendy shoes, and luxury goods. In most cases, key attributes are used to define these SPUs.

The import and classification of standard SPU data proceeds step by step, as follows:

1. Clean the SPU data: remove dirty data such as special characters, test entries, and special symbols, and remove records whose key information is incomplete. Data volume: 10000 -> 1000.
2. Continue cleaning based on keywords, which reduces the data volume by a further 30%.
3. Operations intervention and manual selection.
4. Mark the business identifier and fill in the extended fields, such as BIZ (indicating whether a business is supported) and bizProperty (business properties extended on top of the SPU).
5. Long-term operation and maintenance through the Longgong SPU management system.

With the structured process connected, the Xianyu SPU system uses the OpenSearch search engine to provide query services externally, and uses ODPS -> MySQL -> OpenSearch as its data link. The data link solves the following problems:

• ODPS periodic tasks supplement comprehensive information on a T+1 basis, including but not limited to SPU information, product load, and category level.
• MySQL makes auto-incrementing IDs easy, so Xianyu maintains its own set of spu_id values.
• MySQL -> OpenSearch updates data automatically in real time, with no need for API pushes, scheduled tasks, or manual engine rebuilds.
• OpenSearch supports flexible index conditions and also handles fuzzy search, relevance sorting, and sales sorting well.
• OpenSearch's unique key constraint ensures that the same product is unique across the whole table; the constraint key is built from the combination of attribute value VIDs.

Complete SPU data contains basic information such as SPU attributes, bound attributes, sales attributes, images, and titles. SPU data alone cannot meet Xianyu's business needs. On top of it, we add the Taobao category, Xianyu channel category, business identifier, business attributes, business-specific inspection items, and the number of listings published on the Xianyu platform (counted per SPU), leaving room for future business expansion.

To facilitate operations intervention and later data maintenance, we also designed a companion management system, which supports single and batch addition, multi-dimensional query and analysis, modification, and deletion. Anything that changes online data, such as editing or deleting, goes through Changefree (the safe-production approval process). As shown in the following figure, the platform provides long-term maintenance capabilities.

SPU serves a variety of businesses rather than any particular one. We are committed to building a set of underlying capabilities that horizontally support the commodity system, connect Xianyu's multiple structured businesses, and serve commodity understanding as a general basic capability.
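
The automated import steps described earlier in this section (removing dirty records, keyword filtering, and tagging business fields) can be sketched as a small pipeline. This is only an illustration under stated assumptions: the article does not give the real cleaning rules, so the regular expression, the blocked keywords, the required keys, and every function name are hypothetical. Only the field names `biz` and `bizProperty` come from the text, and manual selection is left out.

```python
import re
from typing import Any, Dict, Iterable, List, Optional

# Hypothetical rules: the real cleaning rules are not given in the article.
DIRTY_PATTERN = re.compile(r"test|[^\w\s/+.-]", re.IGNORECASE)
BLOCKED_KEYWORDS = ("sample", "do not use")
REQUIRED_KEYS = ("brand", "model")


def is_clean(record: Dict[str, str]) -> bool:
    """Step 1: reject records with missing key info or dirty characters."""
    for key in REQUIRED_KEYS:
        value = record.get(key, "").strip()
        if not value or DIRTY_PATTERN.search(value):
            return False
    return True


def passes_keywords(record: Dict[str, str]) -> bool:
    """Step 2: reject records whose title contains a blocked keyword."""
    title = record.get("title", "").lower()
    return not any(word in title for word in BLOCKED_KEYWORDS)


def tag_business(
    record: Dict[str, str],
    biz: str,
    biz_property: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Step 4: copy the record and fill in the extended business fields."""
    tagged: Dict[str, Any] = dict(record)
    tagged["biz"] = biz
    tagged["bizProperty"] = biz_property or {}
    return tagged


def import_spus(records: Iterable[Dict[str, str]], biz: str) -> List[Dict[str, Any]]:
    """Run the automated steps; manual selection (step 3) is out of scope."""
    cleaned = [r for r in records if is_clean(r) and passes_keywords(r)]
    return [tag_business(r, biz) for r in cleaned]


raw = [
    {"brand": "Apple", "model": "iPhone5S", "title": "Apple iPhone5S"},
    {"brand": "Apple", "model": "test123", "title": "Apple test123"},
    {"brand": "", "model": "iPhone11", "title": "iPhone11"},
    {"brand": "Apple", "model": "iPhone11", "title": "Sample listing iPhone11"},
]
result = import_spus(raw, biz="inspection")
print(len(result))  # 1: only the first record survives both filters
```

Keeping each step as its own function makes it easy to measure how much data each stage removes, which is how the volume reductions quoted in the steps would be tracked.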

The use of Xianyu SPU in business scenarios

The SPU system already supports a number of businesses. In subsequent cooperation with industries and business parties, we expect to focus on reducing the cost of producing data, working more closely with each industry, and filling the SPU data pool. At present, Xianyu SPU mainly supports the following scenarios: Inspection Treasure, SPU search and release, and wordless purchase.

Inspection Treasure: Inspection Treasure is a business platform on which Xianyu cooperates with service providers from various industries to increase user trust and provide high-quality goods. Xianyu is gradually expanding the categories that support inspection. In this service, SPU is used to check whether the current input supports inspection. The server side also needs to verify the completeness of the inspection items to ensure that the service can start normally. The SPU system safeguards the launch of the Inspection Treasure service. Applying the basic SPU capabilities lets the business plug into the main publishing flow, and brings 10,000 new transactions to the platform and business parties every day.

SPU search and release: This is a new publishing scenario on Xianyu. It finds the same product by matching SPU information, which reduces the cost of publishing for users and promotes overall growth in publishing. SPU plays the role of the “same product” in this scenario, and the amount of SPU data directly determines the availability and overall user experience of the scenario. The SPU scenario not only reduces the cost of publishing but also provides better structured information. It also reaches new users and less active users, which helps broaden the user base.

Wordless purchase: In the wordless purchase project, SPU also acts as the “product” to provide services externally.
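
As an illustration of the Inspection Treasure check described above (whether the input supports inspection, and whether the inspection items are complete), here is a minimal sketch. The in-memory `SPU_STORE`, its field names, the sample ids, and the item names are all assumptions for this example; the real service would query the SPU system.

```python
from typing import Any, Dict

# Hypothetical SPU store keyed by spu_id; field names are illustrative.
SPU_STORE: Dict[int, Dict[str, Any]] = {
    1001: {"biz": {"inspection"}, "inspection_items": ["screen", "battery"]},
    1002: {"biz": set(), "inspection_items": []},
    1003: {"biz": {"inspection"}, "inspection_items": []},
}


def supports_inspection(spu_id: int) -> bool:
    """True if the SPU exists and is tagged for the inspection business."""
    spu = SPU_STORE.get(spu_id)
    return spu is not None and "inspection" in spu["biz"]


def can_start_inspection(spu_id: int) -> bool:
    """True only if inspection is supported and its items are complete."""
    if not supports_inspection(spu_id):
        return False
    items = SPU_STORE[spu_id]["inspection_items"]
    # Integrity check: at least one item configured, and none of them blank.
    return bool(items) and all(item.strip() for item in items)


print(can_start_inspection(1001))  # True: tagged and items configured
print(can_start_inspection(1003))  # False: tagged but no inspection items
```

Separating the "is supported" check from the completeness check mirrors the two responsibilities named in the text: the first answers the publishing flow, the second guards the start of the service.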