I. Foreword

As the number of users grew rapidly, the monolithic architecture of the vivo official mall V1.0 gradually exposed its drawbacks: increasingly bloated modules, low development efficiency, performance bottlenecks, and a system that was difficult to maintain.

The V2.0 architecture upgrade, launched in 2017, split the system vertically and physically along business module lines. Each separated business line handles its own responsibilities and exposes its capabilities as services, which together support the main site's business.

The commodity module is the core of the whole transaction link. As more and more functionality was added to it, system performance suffered seriously, so turning it into a service was imperative.

This article introduces the problems encountered while building the vivo mall commodity system and how they were solved, and shares our experience in architecture design.

II. Evolution of the commodity system

The commodity module was separated from the mall to become an independent commodity system. It has gradually moved down to the bottom layer, where it provides basic, standardized services for the mall, search, membership, marketing and other systems.

The commodity system architecture diagram is as follows:

In the early stage, the commodity system was chaotic and contained many business modules, such as commodity activities, seckill (flash sale) and inventory management. As the business developed, the commodity system took on more and more of it, which made the system hard to extend and maintain.

We therefore decided to gradually sink the commodity business down into the lowest, most basic business system, one that provides high-performance services to many callers. The upgrade history of the commodity system is described below.

2.1 Stripping of commodity activities and gifts

As the number of commodity activities grew, so did the variety of activity formats, and the extra attributes tied to activities grew with them. These attributes are not strongly related to commodity information. They lean toward user marketing and should not be coupled with the core commodity business, so they were merged into the mall promotion system.

Gifts are not only mobile phones and accessories but may also be points or memberships. These do not fit in the commodity system and do not belong to the commodity module, so they were merged into the mall promotion system at the same time.

2.2 Seckill independence

As we all know, a seckill (flash sale) is characterized by:

  • Time limit: the time window is short, and the activity ends at the set time

  • Limited quantity: the quantity of goods is very small, well below the actual stock

  • High traffic: the low price attracts a large number of users

Given these features, running a seckill activity well is no easy task. Because system resources are shared, a sudden burst of heavy traffic can make the other businesses of the commodity system refuse service, which risks blocking the core transaction link. Seckill was therefore split out into a separate system that provides its services independently.

2.3 Establishment of the commission sales system

The main products sold in our mall are mobile phones and mobile phone accessories, so there are only a few categories of goods. To address the limited range of non-phone products, the operations team considered cooperating with well-known e-commerce companies in order to introduce more categories.

To make later expansion easier and to avoid intruding on the original system, we created a dedicated, independent subsystem to handle the commission sales business. Ultimately we hope to build a complete platform that offers an open API, so that other e-commerce providers can connect to our business on their own initiative.

2.4 Inventory Stripping

Pain points of inventory management:

  • Inventory was kept only at the commodity level, with a single field holding the quantity. Every time a commodity was edited, its inventory had to be adjusted by hand, so inventory could not be managed dynamically;

  • The marketing system also had its own activity inventory management mechanism, so the entry points were scattered and only weakly related to each other;

  • Available inventory and activity inventory were both allocated from the actual inventory, which made allocation errors easy.

Based on these pain points, and to make inventory easier to operate and manage while laying a foundation for selling against actual inventory in the future, we set up an inventory center. It provides the following main functions:

  • Real-time synchronization with the actual inventory in ecMS;

  • Based on how inventory is actually distributed across warehouses, it can determine the expected shipping warehouse and shipping time, and from these calculate the expected delivery time of the goods;

  • Low-inventory warning: it can calculate from available inventory, average monthly sales and similar figures, and dynamically remind operations staff to place orders.
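The low-inventory warning can be illustrated with a small calculation. This is only a sketch under stated assumptions: the `StockSnapshot` fields, the 30-day month and the 7-day threshold are hypothetical and are not the inventory center's actual rules.

```python
from dataclasses import dataclass


@dataclass
class StockSnapshot:
    sku: str
    available: int            # units currently available for sale
    avg_monthly_sales: float  # average units sold per month


def days_of_cover(s: StockSnapshot, days_per_month: int = 30) -> float:
    """Estimate how many days the available stock lasts at the average sales rate."""
    daily_sales = s.avg_monthly_sales / days_per_month
    if daily_sales <= 0:
        return float("inf")  # no recorded sales: the stock is never used up
    return s.available / daily_sales


def needs_reorder(s: StockSnapshot, min_days: float = 7.0) -> bool:
    """Flag the SKU when the estimated cover falls below the (assumed) threshold."""
    return days_of_cover(s) < min_days
```

For example, 100 available units at 600 sales per month last about 5 days and would be flagged, while the same stock at 300 sales per month lasts about 10 days and would not.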

III. Challenges

As the lowest-level system, the main challenges are stability, high performance and data consistency.

3.1 Stability

  • Avoid single-machine bottlenecks: choose the number of nodes according to stress test results, so that resources are not wasted but sudden traffic can still be handled.

  • Rate limiting and degradation: limit traffic on core interfaces so that system availability comes first. When the system is overloaded, degrade non-core services so that core services are protected first.

  • Set proper timeouts: set a proper timeout for access to Redis and the databases, to prevent application threads from being tied up when traffic is heavy.

  • Monitoring and alarms: standardize logs and connect to the company's log monitoring and alarm platform, so that problems are discovered proactively and promptly.

  • Fuse (circuit breaker): wrap calls to external interfaces in a fuse, so that an abnormal external interface does not drag down the system.
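The fuse item above can be sketched as a small circuit breaker. This is a minimal illustration, not the component used in the mall: the class name, the failure threshold and the reset interval are assumptions, and the sketch is not thread-safe.

```python
import time


class CircuitBreaker:
    """Fail fast with a fallback after repeated failures of an external call."""

    def __init__(self, max_failures=3, reset_after=5.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # seconds to wait before a trial call
        self.clock = clock                # injectable clock, handy for tests
        self.failures = 0
        self.opened_at = None             # None means the circuit is closed

    def call(self, func, fallback):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                return fallback()         # open: skip the external call entirely
            # otherwise half-open: let one trial call through
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.max_failures:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            return fallback()
        self.failures = 0                 # success closes the circuit again
        self.opened_at = None
        return result
```

While the circuit is open, callers get the fallback immediately instead of waiting on a failing dependency, which is what keeps application threads free.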

3.2 High performance

Multi-level cache

To improve query speed and reduce pressure on the database, we adopted multi-level caching. Interfaces are connected to a hotspot cache component that detects hot data dynamically: hot data is served directly from the local cache, and everything else is read from Redis.
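The lookup order described above can be sketched as follows. This is a simplified illustration under assumptions: a plain dict stands in for Redis, `hot_keys` would be filled by the hotspot detector in practice, and expiry is left out for brevity.

```python
from typing import Callable, Dict, Optional, Set


class TwoLevelCache:
    """Serve hot keys from local memory, other keys from the remote cache."""

    def __init__(self, remote: Dict[str, str],
                 load_from_db: Callable[[str], Optional[str]]):
        self.local: Dict[str, str] = {}
        self.hot_keys: Set[str] = set()   # assumed to be maintained elsewhere
        self.remote = remote              # stand-in for Redis
        self.load_from_db = load_from_db

    def get(self, key: str) -> Optional[str]:
        # Level 1: local memory, only for keys flagged as hot.
        if key in self.hot_keys and key in self.local:
            return self.local[key]
        # Level 2: remote cache.
        value = self.remote.get(key)
        if value is None:
            # Miss on both levels: go back to the database and refill.
            value = self.load_from_db(key)
            if value is not None:
                self.remote[key] = value
        if value is not None and key in self.hot_keys:
            self.local[key] = value
        return value
```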

Read/write separation

The database uses a read/write separation architecture: the master handles update operations, and the replicas handle queries.
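A routing rule of this kind might look like the sketch below. The node names and the `force_master` flag are hypothetical; the flag is included because replicas can lag behind the master, so reads that must see a just-committed write are usually sent to the master.

```python
import itertools
from typing import List


class ReadWriteRouter:
    """Send SELECT statements to replicas (round-robin), everything else to the master."""

    def __init__(self, master: str, replicas: List[str]):
        self.master = master
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql: str, force_master: bool = False) -> str:
        parts = sql.split(None, 1)
        verb = parts[0].upper() if parts else ""
        if verb == "SELECT" and not force_master and self._replicas is not None:
            return next(self._replicas)
        return self.master
```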

Interface rate limiting

With a rate limiting component added to the system, interfaces that operate on the database directly can be rate limited. This prevents sudden traffic or non-standard calls from increasing database pressure and affecting other interfaces.
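One common way to limit an interface is a token bucket. The sketch below is illustrative only: the source does not say which algorithm the rate limiting component uses, and this single-process version is not thread-safe.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity` and a sustained `rate` of requests per second."""

    def __init__(self, rate: float, capacity: int, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock                # injectable clock, handy for tests
        self.tokens = float(capacity)     # start with a full bucket
        self.updated = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill in proportion to the elapsed time, capped at the capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False                      # caller should reject or degrade
```

A request for which `allow()` returns `False` would be rejected or served a degraded response before it reaches the database.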

But there were some early stumbles:

1. Commodity list queries produced too many Redis keys, creating a risk of running out of Redis memory

Because this is a list query, the cache key was obtained by hashing the input parameters. There are many input parameters, and in some scenarios they change from call to call. Given the number of possible permutations and combinations, almost every request missed the cache, went back to the source, and then cached its own result, which could lead to database denial of service or Redis memory overflow.

Scheme 1: Loop over the input parameter list, fetch each item from Redis individually, and return the results.

This solves the problem of running out of memory because of too many keys, but it obviously increases the number of network round trips. With dozens of keys, the impact on performance is significant. Is there another way to reduce the network round trips?

Scheme 2: We enhanced the original Redis component. Because Redis Cluster does not support MGET across keys that live in different slots, we implemented a batch get with pipelines. The slot of each key is calculated first, and the keys are then grouped and submitted in one batch, so that each commodity's data only needs to be cached once. Fetching in batches also greatly improves query speed.

This solves both the problem of too many keys and the multiple network round trips of Scheme 1. In stress tests, Scheme 2 performed more than 50% better than Scheme 1, and the more keys there are, the more obvious the effect.
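The grouping step of Scheme 2 can be sketched as follows. The slot calculation follows the Redis Cluster specification (CRC16 of the key, or of its hash tag, modulo 16384); the `batch_get` helper and its `fetch_group` callback are hypothetical stand-ins for the enhanced Redis component, which would send each group to the owning node in one pipeline.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Optional


def crc16_xmodem(data: bytes) -> int:
    """CRC16 with polynomial 0x1021 and initial value 0, as used by Redis Cluster."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Hash slot of a key; only a non-empty {hash tag} is hashed when present."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    return crc16_xmodem(key.encode("utf-8")) % 16384


def group_keys_by_slot(keys: Iterable[str]) -> Dict[int, List[str]]:
    groups: Dict[int, List[str]] = defaultdict(list)
    for key in keys:
        groups[key_slot(key)].append(key)
    return dict(groups)


def batch_get(keys: Iterable[str],
              fetch_group: Callable[[List[str]], List[Optional[str]]]
              ) -> Dict[str, Optional[str]]:
    """Fetch all keys with one call per slot group instead of one call per key."""
    result: Dict[str, Optional[str]] = {}
    for group in group_keys_by_slot(keys).values():
        result.update(zip(group, fetch_group(group)))
    return result
```

Real clients usually go one step further and merge the slot groups that belong to the same node, which reduces the number of round trips to at most one per node.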

2. Hot data caused a single-node Redis bottleneck

The mall often holds new product launch events, and after an event users jump straight to the detail page of the new product. That page then receives a particularly large, sudden burst of traffic concentrated on a single piece of data. The load across Redis nodes becomes unbalanced, with some nodes under 10% and others over 90%, and conventional capacity expansion does not help.

We have the following solutions to the hotspot problem:

  • Hash the key so that copies of it are spread across different nodes

  • Use a local cache

At first we implemented the local cache component on top of the open-source Caffeine library. It counts requests locally and caches the data once a key reaches a certain threshold. The cache time varies by business scenario but is usually no more than 15 seconds, since the main goal is to deal with hot data.

Later it was replaced by a hotspot cache component we developed ourselves, which supports dynamic hotspot detection, hotspot reporting, cluster broadcasting and other functions.
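The threshold-based local caching described above can be sketched in a few lines. This is neither the Caffeine-based component nor the in-house one: the fixed counting window, the default threshold and the class name are assumptions, while the 15-second upper bound on the cache time comes from the text.

```python
import time
from collections import defaultdict


class HotKeyCache:
    """Cache a key locally for a short time once it is requested often enough."""

    def __init__(self, loader, threshold=100, window=1.0, ttl=15.0,
                 clock=time.monotonic):
        self.loader = loader          # reads from Redis or the database
        self.threshold = threshold    # requests per window that make a key hot
        self.window = window          # length of the counting window in seconds
        self.ttl = ttl                # how long a hot value stays in local memory
        self.clock = clock
        self.window_start = clock()
        self.counts = defaultdict(int)
        self.entries = {}             # key -> (value, expires_at)

    def get(self, key):
        now = self.clock()
        entry = self.entries.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]           # served locally, no remote call
        if now - self.window_start >= self.window:
            self.counts.clear()       # start a new counting window
            self.window_start = now
        self.counts[key] += 1
        value = self.loader(key)
        if self.counts[key] >= self.threshold:
            self.entries[key] = (value, now + self.ttl)
        return value
```

The short time-to-live bounds how stale a hot value can become, which is why the cache time is kept to seconds rather than minutes.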

3.3 Data Consistency

1. Data consistency with Redis is relatively easy to handle, using the "Cache Aside Pattern":

For read requests, the cache is read first. On a hit, the value is returned directly. On a miss, the data is read from the database and written back to the cache. For write requests, the database is updated first and the cache entry is then deleted.
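The read and write paths above can be sketched as follows, with plain dicts standing in for Redis and the database (an assumption made to keep the example self-contained).

```python
from typing import Dict, Optional


class CacheAside:
    """Cache-aside: read through the cache, delete the cache entry on writes."""

    def __init__(self, cache: Dict[str, str], db: Dict[str, str]):
        self.cache = cache  # stand-in for Redis
        self.db = db        # stand-in for the database

    def read(self, key: str) -> Optional[str]:
        value = self.cache.get(key)
        if value is not None:
            return value                 # cache hit
        value = self.db.get(key)         # cache miss: go to the database
        if value is not None:
            self.cache[key] = value      # refill the cache for later reads
        return value

    def write(self, key: str, value: str) -> None:
        self.db[key] = value             # 1. update the database first
        self.cache.pop(key, None)        # 2. then delete the cache entry
```

Deleting the entry rather than updating it means the next read repopulates the cache from the database, so the cache never holds a value that was computed from stale input.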

2. After inventory was split out, its maintenance entry point remained in the commodity system. This led to operations that span databases, which an ordinary single-database transaction cannot cover.

At first we handled this by catching exceptions and rolling back the local transaction. This was somewhat cumbersome to work with, but it solved the problem.

Later we built a distributed transaction component on the open-source Seata, adapting its code so that it integrates with the company's basic components. It is now in use.
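The early approach of catching exceptions and rolling back the local transaction might look roughly like the sketch below. Everything here is hypothetical: `LocalTx` is a toy stand-in for a single-database transaction, `adjust_stock` stands for the remote call to the inventory system, and the compensation step is an assumption about how a remote change would be undone.

```python
from typing import Callable, Dict


class LocalTx:
    """Toy single-database transaction over a dict, using a snapshot for rollback."""

    def __init__(self, table: Dict[str, str]):
        self.table = table
        self._snapshot: Dict[str, str] = {}

    def begin(self) -> None:
        self._snapshot = dict(self.table)

    def update(self, key: str, value: str) -> None:
        self.table[key] = value

    def commit(self) -> None:
        self._snapshot = {}

    def rollback(self) -> None:
        self.table.clear()
        self.table.update(self._snapshot)


def edit_commodity(tx: LocalTx, adjust_stock: Callable[[str, int], None],
                   sku: str, name: str, delta: int) -> bool:
    """Update the commodity locally and the stock remotely; undo both on failure."""
    tx.begin()
    stock_adjusted = False
    try:
        tx.update(sku, name)
        adjust_stock(sku, delta)       # cross-database step, outside the local tx
        stock_adjusted = True
        tx.commit()
        return True
    except Exception:
        tx.rollback()                  # undo the local change
        if stock_adjusted:
            adjust_stock(sku, -delta)  # compensate the remote change
        return False
```

Every cross-database operation needs its own hand-written compensation of this kind, which is the burden a distributed transaction framework takes over.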

IV. Summary

This article described how the mall's commodity system was split up and gradually sunk down into the most basic system, giving it a more focused responsibility and allowing it to provide high-performance commodity services. It also shared the technical problems encountered along the way and how they were solved. Follow-up articles will cover the evolution of the inventory system and distributed transactions, so please stay tuned.

Author: Ju Changjiang, Development team of Vivo official website Mall