Seckill scenarios

The most typical example is the Double 11 seckill run by e-commerce companies such as Taobao and JD.com. Hundreds of millions of users pour in within a short time, creating enormous instantaneous traffic (high concurrency). For example, 2 million people may be ready to snap up a product at midnight while only 100 units are on sale, so no more than 100 users can actually buy it; the product must not be oversold.

From the business point of view, though, a seckill activity wants more people to participate: before the buying starts, the more people browsing the product the better. Once the buying starts and users begin placing orders, however, the seckill back end definitely does not want millions of people launching purchase requests at the same moment.

We all know that a server's processing resources are limited, so when a peak arrives it is easy for the servers to go down and for users to lose access, just like the morning and evening rush hours in transportation, for which staggered, restricted travel is the standard remedy.

Similarly, online business scenarios such as seckill need a comparable solution to survive the traffic peaks caused by simultaneous buying, and this is where peak traffic clipping comes in.

Understanding the seckill system

So how can we understand the seckill system better? I think that, as a programmer, you first need to think at a high level and holistically. In my view, seckill mainly solves two problems: concurrent reads and concurrent writes. The core optimization idea for concurrent reads is to minimize the number of users "reading" data on the server, or to let them read less data; the principle for concurrent writes is the same, and at the database level it asks us to split out a separate database for special handling. Beyond that, we also need to protect the seckill system itself, designing a fallback plan for unexpected situations so that the worst case is guarded against.

From an architect's perspective, to build and maintain a large-traffic, high-performance, high-availability system for concurrent reads and writes, the whole route from the user's browser to the server should follow a few principles: keep the data a request carries as small as possible, the number of requests as few as possible, the path as short as possible, the dependencies as few as possible, and never allow a single point. These are the key points I will focus on later in this article.

In fact, the overall architecture of a seckill system can be summarized by a few key words: "stable, accurate, fast".

The so-called "stable" means the whole system architecture must be highly available: it should hold steady when traffic meets expectations and must not collapse when traffic exceeds them. You must ensure the seckill activity completes smoothly, that is, the seckill products sell out smoothly; this is the most basic premise.

Next is "accurate": if you seckill 10 iPhones, exactly 10 must be sold; one more or one fewer is unacceptable. Once the inventory goes wrong, the platform bears the loss, so "accurate" demands data consistency.

Finally, "fast". "Fast" is easy to understand: the system's performance must be high enough, otherwise how could it support such massive traffic? It is not just the server that must be optimized to the extreme; the entire request chain must be optimized in concert. The faster every link, the better the whole system.

Therefore, from a technical point of view, "stable, accurate, fast" corresponds to our architecture requirements of high availability, consistency and high performance. This column focuses on these aspects, as follows:

  • High performance. Seckill involves heavy concurrent reads and writes, so supporting high-concurrency access is critical. This column covers four aspects: the design of a dynamic-static data separation scheme, hotspot discovery and isolation, peak clipping and layered filtering of requests, and extreme optimization of the server side.
  • Consistency. Implementing inventory deduction correctly is also key to seckill. A limited quantity of goods faces simultaneous deduction requests many times its size; deduction can happen at different points ("deduct on order", "deduct on payment", pre-deduction, and so on), and guaranteeing data accuracy under massive concurrent updates is clearly difficult. I will therefore devote an article to designing an inventory-deduction scheme.
  • High availability. Although many extreme optimization ideas will be introduced, reality always holds situations we failed to consider. So to guarantee the system's high availability and correctness, we also need a Plan B as a cushion, so that the worst case is still covered. At the end of this column I'll walk through the steps for designing such a fallback solution.

Architecture principles: "4 dos and 1 don't"

Keep data to a minimum

  • "Keep data to a minimum" means, first, that users should request as little data as possible. The requested data includes both data uploaded to the system and data the system returns to the user (usually a web page).
  • Why "keep data to a minimum"? First, transmitting data over the network takes time. Second, both requested and returned data must be processed by the server, and writing to the network usually involves compression and character encoding, which are very CPU-intensive, so reducing the amount of data transmitted significantly reduces CPU usage. For example, we can simplify the page and remove unnecessary decorative effects.
  • Second, "keep data to a minimum" also requires the system to depend on as little data as possible, meaning the data the system must read and write to complete its business logic, which it usually exchanges with backend services and databases. Calling another service involves serializing and deserializing data, which is a major CPU killer and also adds latency. The database itself easily becomes a bottleneck too, so the less you interact with it the better, and the simpler and smaller the data, the better.

Keep the number of requests to a minimum

  • When a page is returned, the browser issues additional requests while rendering it; the CSS/JavaScript, images and Ajax calls the page depends on are all "additional requests", and they should be kept to a minimum. Every request the browser makes has a cost: the three-way handshake to establish a connection, the browser's per-domain connection limits, serial loading of some resources (such as JavaScript), and so on. On top of that, if different requests use different domain names, resolving those domains through DNS is involved, which may take even longer. So keep in mind that reducing the number of requests significantly reduces the resource consumption all of these factors cause.
  • For example, one of the most common ways to reduce the number of requests is to merge CSS and JavaScript files, combining several JavaScript files into a single request whose URL separates the file names with commas (g.xxx.com/tm/xx-b/4.0… ??module-preview/index.xtpl.js,module-jhs/index.xtpl.js,module-focus/index.xtpl.js). The files are still stored separately on the server; a server-side component parses the URL and dynamically merges the files back into one response, as sketched below.
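As a rough sketch of such a merge component (my own illustration, not the actual Taobao implementation; the endpoint, parameter name, and file root are hypothetical), a Spring MVC controller could split the comma-separated list and concatenate the files:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class CombineController {

    // Hypothetical root directory holding the individual JS files
    private static final Path JS_ROOT = Paths.get("/data/static/js");

    /**
     * One request such as GET /combine?files=a.js,b.js,c.js returns the
     * files concatenated into a single response, saving N-1 round trips.
     */
    @GetMapping(value = "/combine", produces = "application/javascript")
    public String combine(@RequestParam("files") String files) throws IOException {
        StringBuilder merged = new StringBuilder();
        for (String name : files.split(",")) {
            Path file = JS_ROOT.resolve(name).normalize();
            if (!file.startsWith(JS_ROOT)) { // prevent path traversal
                throw new IllegalArgumentException("illegal path: " + name);
            }
            merged.append(Files.readString(file)).append('\n');
        }
        return merged.toString();
    }
}
```

The browser pays for one request instead of many; the server pays a little CPU to concatenate, which is usually a good trade.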

Keep the path as short as possible

  • The so-called “path” is the number of intermediate nodes that the user needs to pass through in the process of making a request and returning data.
  • Typically, each node can be seen as a system or a new Socket connection (a proxy server, for example, simply creates a new Socket connection to forward the request); traversing a node generally creates a new Socket connection.
  • However, every additional connection adds new uncertainty. Statistically speaking, if a request passes through five nodes and each node is 99.9% available, then the availability of the whole request is 0.999 to the power of 5, roughly 99.5%.
  • Therefore, shortening the request path not only increases availability but also improves performance (fewer intermediate nodes mean less serialization and deserialization) and reduces latency (less network transfer time).
  • One way to shorten the access path is to deploy multiple strongly dependent applications together, turning remote procedure calls (RPCs) into method calls inside a single JVM. In my book Technical Architecture Evolution and Performance Optimization for Large Web Sites, I devote a chapter to the detailed implementation of this technique.

Rely as little as possible

  • "Dependency" here means the systems or services that must be called to complete a user request, and specifically the strong dependencies.
  • For example, a seckill page strongly depends on product information and user information, while other content such as coupons and deal lists is non-essential (weak dependencies) and can be dropped in an emergency.
  • To reduce dependencies we can classify systems into levels, say level 0, level 1, level 2 and level 3. If level 0 systems are the most important, then the systems level 0 strongly depends on are just as important, and so on.
  • Note that level 0 systems should minimize their strong dependence on level 1 systems, to keep important systems from being dragged down by less important ones. For example, if payment is a level 0 system and coupons are a level 1 system, the coupon service can be downgraded in extreme cases so the payment system is not overwhelmed by it. (A sketch of such a degradation guard follows this list.)
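To make the degradation idea concrete, here is a hedged sketch (not from the original text) in which a level-0 flow guards its call to a level-1 coupon service with a timeout and an empty fallback; the interface and the 200 ms budget are assumptions:

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class CouponDegradation {

    /** Hypothetical weak-dependency client (a level-1 system). */
    interface CouponService {
        List<String> listCoupons(long userId);
    }

    private final CouponService couponService;

    public CouponDegradation(CouponService couponService) {
        this.couponService = couponService;
    }

    /**
     * Coupons are a weak dependency of the order flow: if the call is slow
     * or fails, degrade to an empty list instead of blocking payment.
     */
    public List<String> couponsOrEmpty(long userId) {
        try {
            return CompletableFuture
                    .supplyAsync(() -> couponService.listCoupons(userId))
                    .get(200, TimeUnit.MILLISECONDS); // assumed 200 ms budget
        } catch (TimeoutException | ExecutionException e) {
            return Collections.emptyList(); // degrade: page renders without coupons
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return Collections.emptyList();
        }
    }
}
```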

Don’t have a single point

  • A single point is a taboo in system architecture, because a single point means no backup and uncontrollable risk. One of the most important principles in designing distributed systems is to eliminate single points.

Different architectures for different scenarios

As request volume grows (say, from 10,000/s to 100,000/s), this simple architecture quickly hits a bottleneck and needs changes to improve system performance. These changes include:

  • Build the seckill as a separate system so targeted optimizations can be made; for example, the standalone system drops the shop-decoration feature and reduces page complexity;
  • Deploy it on an independent machine cluster, so that heavy seckill traffic does not affect the load of the clusters serving normal product purchases;
  • Put hot data (such as inventory data) into a separate cache system to improve "read performance";
  • Add a seckill answering step to stop seckill tools from grabbing orders.

However, this architecture still cannot support more than 1,000,000 requests/s, so to further improve seckill performance we upgrade it again, for example:

  • Fully separate static and dynamic content on the page, so that users no longer refresh the whole page but only click the "grab" button, shrinking the data refreshed by the page to a minimum;
  • Cache seckill products locally on the server side, so that obtaining data needs no calls to dependent backend services and not even queries against the shared cache cluster; this both reduces system calls and avoids crushing the shared cache cluster;
  • Add rate-limiting protection to the system to guard against the worst case.

How to implement static-dynamic separation

What is static data

So what exactly is static-dynamic separation? It means splitting the data of a user request (such as an HTML page) into "dynamic data" and "static data".

Simply put, what distinguishes "dynamic data" from "static data" is whether the data output on the page depends on the URL, the viewer, time, or region, and whether it contains private data such as cookies. For example:

  1. On many media sites, an article's content is the same whoever visits it, so it is typically static data, even though the page serving it is a dynamic page.
  2. If we visit Taobao's home page now, each person may see a different page: the home page contains a great deal of content recommended according to the visitor's characteristics, and this personalized data can be understood as dynamic data.

With static and dynamic data understood, you can easily see the idea behind the "static-dynamic separation" solution: separate out the static data, then cache it. With caching, the "access efficiency" of static data naturally improves.

So, how do you cache static data? I’ve summarized a few key points here:

  • First, cache static data as close to the user as possible. Static data is relatively unchanging, so we can cache it. Where? In three common places: the user's browser, the CDN, or the server-side cache. Depending on the situation, push the cache as close to the user as you can.

  • Second, static transformation means caching the HTTP response directly. Unlike ordinary data caching, a statically transformed system lets the Web proxy server fetch the stored HTTP response header and body straight from the request URL and return them as they are. The process is so simple that the HTTP response need not be reassembled, and even the HTTP request headers need not be parsed.

  • Third, who caches the static data also matters. Cache software written in different languages handles cached data with different efficiency. Take Java: you can cache at the Web server layer instead of inside the Java application, because Java systems have their own weaknesses (for example, they are not good at handling huge numbers of connection requests, each connection consumes a lot of memory, and the Servlet container is slow at parsing HTTP); caching outside masks those language-level weaknesses. Web servers such as Nginx, Apache and Varnish are also much better at serving large volumes of concurrent static file requests. (A small sketch of driving such caching via HTTP headers follows this list.)
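Browser and CDN caching of static data is normally driven by HTTP cache headers. The sketch below (my own illustration, with an arbitrary URL and a 5-minute lifetime) shows an application marking a static fragment publicly cacheable so that requests can be absorbed before they reach the Java layer:

```java
import java.util.concurrent.TimeUnit;

import org.springframework.http.CacheControl;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class StaticItemController {

    /**
     * The static part of an item page rarely changes, so tell the browser
     * and any CDN in front to cache it instead of hitting the Java layer.
     */
    @GetMapping("/item/static/{id}")
    public ResponseEntity<String> staticPart(@PathVariable("id") Integer id) {
        String html = loadStaticFragment(id);
        return ResponseEntity.ok()
                .cacheControl(CacheControl.maxAge(5, TimeUnit.MINUTES).cachePublic())
                .body(html);
    }

    // Placeholder for however the static fragment is actually produced
    private String loadStaticFragment(Integer id) {
        return "<div>item " + id + "</div>";
    }
}
```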

Static-dynamic separation architecture

After separating the static from the dynamic in the system, we naturally think of a further step: moving the cache forward onto the CDN. The CDN is closest to the user, so the effect is better.

But in order to do so, there are several problems that need to be solved.

  • Invalidation problem. We mentioned the cache expiry problem earlier; let me explain it again. For static data the key phrase I used was "relatively unchanging", which, turned around, means "subject to change". An article, for example, stays the same now, but if a typo is found, will it change? If your cache lifetime is long, clients will see the wrong content for a long time. So this scheme requires the CDN to be able to invalidate caches distributed across the country within seconds, which places high demands on the CDN's invalidation system.
  • Hit-ratio problem. The most important metric of a cache is a high hit ratio, otherwise the cache is meaningless. If all the data is spread across CDN nodes nationwide, the cache is bound to fragment, and fragmentation lowers the probability that access requests hit the same cache, so the hit ratio becomes a problem.
  • Release and update problem. If a business system has routine releases every week, the release system must be simple and efficient, and you must consider how to roll back quickly and locate problems easily when they occur.

From the above analysis, it is unrealistic to put the product detail system on all CDN nodes nationwide, because of the invalidation, hit-ratio, and release-and-update problems. Could we instead pick a few nodes to try it on? The answer is "yes", but such nodes need to meet several conditions:

  • Close to regions where traffic is concentrated;
  • Relatively far from the main site;
  • A good, stable network between the node and the main site;
  • Large node capacity, so that it does not take up too many resources of the rest of the CDN;
  • Not too many nodes.

Based on these factors, the CDN's level-2 cache is the right fit, because level-2 cache nodes are fewer in number but larger in capacity. Users' requests go first to the CDN's level-2 cache (the one nearest the origin); if it misses, the request falls back to the origin site for the data.

Using the CDN's level-2 cache achieves a hit ratio comparable to the current server-side static cache. Because the nodes are few, the cache is not very scattered and traffic is concentrated, so the hit-ratio problem is solved and users get the best access experience; this is currently an ideal CDN solution. This CDN deployment scheme also has the following characteristics:

  • The entire page is cached in the user's browser;
  • If the user forces a refresh of the whole page, the CDN is requested;
  • The only truly valid request is the user's click on the "refresh to grab" button.

This way, 90% of the static data is cached on the client or the CDN. When the seckill actually starts, the user only needs to click the special "refresh to grab" button rather than refresh the whole page. The system thus requests only a small amount of valid data from the server and no longer repeatedly requests large amounts of static data.

The dynamic data of the seckill process is less than that of an ordinary detail page, with performance improved by more than 3x. The "grab" button design thus lets us request the latest dynamic data from the server without refreshing the page.

Handling the system's hot data

If your system stores billions of products, and tens of millions of them are accessed every day by hundreds of millions of users, then there are bound to be some products accessed by a very large number of users. These are what we call "hot products".

Why care about hotspots

First of all, hotspot requests take up a large share of server processing resources: although hotspots may account for only one billionth of all requests, they may occupy 90% of the server's resources. And if a hotspot request is an invalid one with no business value, it is pure waste of system resources.

What is a "hotspot"

Hotspots are classified into hotspot operations and hotspot data.

The so-called "hot operations" include massive page refreshes, massive add-to-cart actions, the flood of orders at midnight on Double 11, and the like. For the system these operations can be abstracted into "read requests" and "write requests", which are handled quite differently: read requests leave far more room for optimization, while the bottleneck of write requests generally lies in the storage layer.

“Hotspot data” is easy to understand. It is the data corresponding to the user’s hotspot request. Hotspot data is divided into static hotspot data and dynamic hotspot data.

  • Static hotspot data is hotspot data that can be predicted in advance. For example, we can screen out hot products ahead of time via seller sign-up and mark them through a registration system; we can also use big-data analysis to find hot products in advance, for example analyzing historical transaction records and users' shopping-cart records to see which products are likely to be popular and sell well. All of these are hot products that can be identified beforehand.
  • "Dynamic hotspot data" is the hotspot that cannot be predicted in advance and arises temporarily while the system is running. For example, a seller advertises on Douyin, the product goes viral, and it is bought in large quantities within a short time. (A toy hot-key counter follows this list.)
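Before looking at how large systems discover hotspots, a toy version may help fix the idea. The sketch below (my own, not the article's middleware) keeps an in-memory counter per item and reports the current top-N hot keys; in production, as described next, the counting happens inside Nginx, caches, and RPC middleware, and the results are aggregated:

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.stream.Collectors;

/**
 * Minimal in-memory hot-key counter: each access increments a counter,
 * and topN() returns the currently hottest item ids.
 */
public class HotKeyCounter {

    private final ConcurrentHashMap<String, LongAdder> counters = new ConcurrentHashMap<>();

    public void record(String itemId) {
        counters.computeIfAbsent(itemId, k -> new LongAdder()).increment();
    }

    public List<String> topN(int n) {
        return counters.entrySet().stream()
                .sorted(Comparator.comparingLong(
                        (Map.Entry<String, LongAdder> e) -> e.getValue().sum()).reversed())
                .limit(n)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    /** Called once per statistics window so old traffic does not dominate. */
    public void reset() {
        counters.clear();
    }
}
```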

Discovering Hotspot Data

Discovering static hotspot data

As said earlier, static hotspot data can be found through business means, for example forcing sellers to sign up in advance so that hot products are filtered out ahead of time: an operations system marks the products taking part in the activity, and a backend system then preprocesses these hot products, for example caching them ahead of time. However, screening via advance sign-up brings new problems: it raises the sellers' cost, it is not real-time, and it is not very flexible.

  • Besides advance sign-up screening, hotspots can also be predicted by technical means, for example using big data to compute the products buyers visit each day and deriving a TOP N list; those TOP N products can be treated as hot products.

Discovering dynamic hotspot data

Implementing a dynamic hotspot discovery system

  1. Build an asynchronous system to collect hot keys from the middleware on each link of the transaction chain, such as Nginx, caches, and the RPC service framework (some middleware products come with a built-in hotspot statistics module).
  2. Establish a specification for hotspot reporting and on-demand, subscription-based hotspot delivery. Its main purpose is to use the access-time differences between the systems on the transaction chain (detail, shopping cart, trade, discount, inventory, logistics, and so on) to pass hotspots discovered upstream transparently to downstream systems, so they can protect themselves in advance. For example, during a rush the detail system is the first to know, and at the unified access layer an Nginx module counts the hot URLs.
  3. Upstream systems send the collected hotspot data to the hotspot service desk, so that downstream systems (such as the trading system) learn which items are being requested frequently and apply hotspot protection.

I originally illustrated this with a figure. On the user's access path there are a great many products, and we rely mainly on the upstream guide pages (home page, search page, product detail, shopping cart, and so on) to identify in advance which products carry high traffic; the middleware embedded in those systems collects the hotspot data and records it to logs.

An Agent deployed on every machine ships the logs to an aggregation and analysis cluster, which then pushes hot data matching certain rules to the relevant systems through a subscription-and-distribution system. You can populate the cache with the hot data, push it straight into the memory of the application servers, or use it for interception: the downstream system subscribes to the data and decides how to handle it according to its own needs.

Traffic peak clipping

Peak clipping essentially means delaying user requests and filtering access layer by layer, following the principle that as few requests as possible should finally land on the database. There are three main approaches to clipping peak traffic: queuing, answering questions, and layered filtering.

  1. Queuing. The simplest solution is to use a message queue to buffer instantaneous traffic, turning synchronous direct calls into asynchronous indirect pushes: a queue absorbs the instantaneous flood peak at one end while messages are pushed out smoothly at the other. The message queue acts as a reservoir, holding back the upstream flood and cutting the peak flow into the downstream channel, thereby preventing a flood disaster.
  2. Answering questions. The question step delays requests and flattens the peak of request traffic, letting the system support the instantaneous spike better.
  3. Layered filtering. Queuing and answering either buffer received requests or reduce how many are sent at once; layered filtering, another method suited to seckill, filters out invalid requests along the way. A request travels from the Web layer through the cache and the message queue and finally to the database, like a funnel: each layer filters out as much data and as many requests as possible, so that only valid requests reach the end of the funnel (the database).

Several peak clipping schemes

Queuing

The easiest peak clipping solution is to use message queues to buffer instantaneous traffic: convert synchronous direct calls into asynchronous indirect pushes, with a queue receiving the instantaneous flood peak at one end and smoothly pushing messages out at the other. The message queue acts as a "reservoir", holding the upstream flood and reducing the peak flow into the downstream channel, thus achieving flood relief.

However, if the traffic peak lasts for a while and the queue reaches its maximum number of messages, for example the local message backlog hits the maximum storage space, then the message queue too will be overwhelmed. That still protects the downstream system, but it is not much different from simply dropping requests; when a true flood breaks out, even a reservoir is of little help.
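To make the "reservoir" concrete, here is a minimal in-process sketch that uses a bounded BlockingQueue as a stand-in for a real message queue (RocketMQ, Kafka, RabbitMQ and the like in production); the capacity and method names are assumptions:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class OrderBuffer {

    // Bounded queue: the "reservoir". When it is full, we reject instead of drowning.
    private final BlockingQueue<Long> queue = new ArrayBlockingQueue<>(10_000);

    /** Called on the request path: O(1), returns immediately. */
    public boolean submit(long userId) {
        return queue.offer(userId); // false => queue full, ask the user to retry
    }

    /** A single consumer drains the queue at a pace the database can sustain. */
    public void startConsumer() {
        Thread consumer = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    long userId = queue.take();
                    placeOrder(userId); // hypothetical downstream call
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "order-consumer");
        consumer.setDaemon(true);
        consumer.start();
    }

    private void placeOrder(long userId) {
        // deduct stock + insert order at a controlled rate
    }
}
```

The bounded capacity matters: when the reservoir is full, submit() fails fast and the user is asked to retry, instead of the backlog growing until the queue itself collapses.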

Answering questions

If you remember, early seckills were purely a matter of refreshing the page and clicking the buy button; the question step was added only later. So why add an answering feature?

Its main purpose is to increase the complexity of placing an order, which serves two ends.

The first purpose is to prevent some buyers from cheating with seckill tools while participating. When seckill was at its hottest in 2011, seckill tools were rampant too, defeating the marketing goal of broad participation, so the system added questions to limit them. After the questions were added, the time to place an order settled at roughly 2s, and the share of orders placed by seckill tools dropped greatly.

The second purpose is to delay requests and flatten the peak of request traffic, so the system can better support the instantaneous spike. This feature stretches the peak of order requests from under 1s to between 2s and 10s, sharding the request peak by time. This time-based sharding is very important for the server's concurrency handling and greatly reduces pressure. Also, because requests now arrive in sequence, later requests find no inventory left and never reach the final ordering step, so true concurrent writes are very limited. This design idea is common today; Alipay's "Xiu yi xiu" and WeChat's "Shake" are similar.

Here I focus on the design ideas of seckill answering; a hedged sketch follows.
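As a rough illustration of these ideas (my own sketch, with a hypothetical question, thresholds, and storage), the gate below rejects answers that come back faster than a human plausibly could, which blocks seckill tools, while the 2s-10s acceptance window spreads order requests over time:

```java
import java.util.concurrent.ConcurrentHashMap;

public class AnswerGate {

    // When each user was shown the question (hypothetical in-memory storage;
    // a real system would keep this in Redis with an expiry)
    private final ConcurrentHashMap<Long, Long> issuedAt = new ConcurrentHashMap<>();

    public String issueQuestion(long userId) {
        issuedAt.put(userId, System.currentTimeMillis());
        return "3 + 4 = ?"; // hypothetical question
    }

    public void checkAnswer(long userId, String answer) {
        Long t = issuedAt.remove(userId);
        if (t == null) {
            throw new IllegalStateException("no question issued");
        }
        long elapsed = System.currentTimeMillis() - t;
        // Humans need at least ~2s to read and type; instant answers look like bots.
        // The upper bound keeps the order peak spread over a bounded window.
        if (elapsed < 2_000 || elapsed > 10_000) {
            throw new IllegalStateException("answer window violated");
        }
        if (!"7".equals(answer.trim())) {
            throw new IllegalStateException("wrong answer");
        }
        // Passed: the caller may now proceed to place the order
    }
}
```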

Layered filtering

The queuing and answering methods described above either buffer incoming requests or reduce the number of requests sent; another approach for the seckill scenario is to filter requests in layers, weeding out the invalid ones. Layered filtering processes requests with a "funnel" design.

The core idea of layered filtering is to filter out as many invalid requests as possible at each level, leaving only valid requests at the end of the "funnel". To achieve this, we must validate the data layer by layer.

The basic principles for layered verification are as follows:

  • Use caches for dynamically requested read data at the Web layer, to filter out invalid data reads;
  • Do not perform strong consistency checks on read data, to reduce the bottleneck that consistency checking creates;
  • Segment write data sensibly by time, to filter out requests that arrive outside the valid window;
  • Apply rate limiting to write requests, to filter out requests beyond the system's capacity;
  • Perform strong consistency checks only on write data, keeping only the data that is valid at the very end.

The purpose of layered validation is, in read systems, to minimize the bottleneck caused by consistency checking while moving forward the checks that do not hurt performance, such as whether the user is qualified, whether the product status is normal, whether the user's seckill answer is correct, whether the seckill has already ended, whether the request is illegal, whether marketing vouchers are sufficient, and so on. In write systems, consistency checks are performed on the written data (for example, the inventory), and the final accuracy of the data is guaranteed at the database level (for example, inventory cannot be reduced to a negative value). The sketch below illustrates this ordering.
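The following sketch (an illustration with stubbed checks, not the column's code) shows the ordering this implies: cheap, possibly slightly stale checks run first, and the only strongly consistent check happens at the database, at the end of the funnel:

```java
public class LayeredCheck {

    /**
     * Cheap, possibly slightly stale checks first; the expensive,
     * strongly consistent write happens only at the end of the funnel.
     */
    public void order(long userId, long itemId, String answer) {
        if (!answerCorrect(userId, answer)) {            // in-memory / Redis check
            throw new IllegalStateException("wrong answer");
        }
        if (!seckillStartedAndNotEnded(itemId)) {        // in-memory check
            throw new IllegalStateException("not in seckill window");
        }
        if (cachedStock(itemId) <= 0) {                  // cache read, may be stale
            throw new IllegalStateException("sold out");
        }
        deductStockWithConsistencyCheck(itemId);         // DB: the one strong check
        insertOrder(userId, itemId);
    }

    // Stubs standing in for the real checks described in the text
    private boolean answerCorrect(long userId, String answer) { return true; }
    private boolean seckillStartedAndNotEnded(long itemId) { return true; }
    private int cachedStock(long itemId) { return 1; }
    private void deductStockWithConsistencyCheck(long itemId) { }
    private void insertOrder(long userId, long itemId) { }
}
```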

Seckill practice (pure back-end design)

  1. After front-end validation, the user's request reaches the back end
  2. The back end checks the inventory, deducts the inventory, and creates the order
  3. Finally the data is landed and stored persistently

Preparation

The database

```sql
DROP DATABASE IF EXISTS seckill;
CREATE DATABASE seckill;
USE seckill;

DROP TABLE IF EXISTS `t_seckill_stock`;
CREATE TABLE `t_seckill_stock` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT COMMENT 'stock id',
  `name` varchar(50) NOT NULL DEFAULT 'OnePlus 7 Pro' COMMENT 'product name',
  `count` int(11) NOT NULL COMMENT 'total stock',
  `sale` int(11) NOT NULL COMMENT 'sold count',
  `version` int(11) NOT NULL COMMENT 'optimistic lock version',
  `create_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'create time',
  `update_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'update time',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin COMMENT='stock table';

INSERT INTO `t_seckill_stock` (`count`, `sale`, `version`) VALUES ('10', '0', '0');

DROP TABLE IF EXISTS `t_seckill_stock_order`;
CREATE TABLE `t_seckill_stock_order` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `stock_id` int(11) NOT NULL COMMENT 'stock id',
  `name` varchar(30) NOT NULL DEFAULT 'OnePlus 7 Pro' COMMENT 'product name',
  `create_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'create time',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin COMMENT='stock order table';
```

The cache

```bash
# 1. Install Redis with Homebrew (installed under /usr/local/Cellar/ by default,
#    with the config file at /usr/local/etc/redis.conf)
brew install redis

# 2. Start Redis: either as a brew-managed service, or directly with the config file
brew services start redis
# or
redis-server /usr/local/etc/redis.conf

# 3. Check the Redis server process
ps axu | grep redis

# 4. Connect with redis-cli (default port 6379, default auth empty)
redis-cli -h 127.0.0.1 -p 6379

# 5. Test the connection from the client
#    redis 127.0.0.1:6379> PING
#    PONG

# 6. Stop Redis: the correct way is to send the SHUTDOWN command
redis-cli shutdown
# or force-stop it
sudo pkill redis-server
```

The code project

Github.com/lmandlyp163…

The traditional way

We first build a backend interface (check inventory, deduct inventory, create order) without any restrictions, and use JMeter to simulate 1000 concurrent threads buying a product with 10 units of inventory.

Approach

Without any control, we simply follow the flow of checking inventory, deducting inventory, and placing the order; this way, concurrency problems will occur.

Code implementation

Interface entry

```java
/**
 * Place an order the traditional way (no concurrency control)
 */
@PostMapping("/createWrongOrder/{id}")
public ResponseBean createWrongOrder(@PathVariable("id") Integer id) throws Exception {
    Integer orderCount = seckillEvolutionService.createWrongOrder(id);
    return new ResponseBean(HttpStatus.OK.value(), "purchase success", orderCount);
}
```

Core service logic (pure DB operations): check inventory, deduct inventory, create order

```java
@Override
@Transactional(rollbackFor = Exception.class)
public Integer createWrongOrder(Integer id) throws Exception {
    // Check inventory
    StockDto stockDto = stockDao.selectByPrimaryKey(id);
    if (stockDto.getCount() <= 0) {
        throw new CustomException("Out of stock");
    }
    // Deduct inventory
    stockDto.setCount(stockDto.getCount() - 1);
    stockDto.setSale(stockDto.getSale() + 1);
    Integer saleCount = stockDao.updateByPrimaryKey(stockDto);
    if (saleCount <= 0) {
        throw new CustomException("Inventory deduction failed");
    }
    // Create order
    StockOrderDto stockOrderDto = new StockOrderDto();
    stockOrderDto.setStockId(stockDto.getId());
    Integer orderCount = stockOrderDao.insertSelective(stockOrderDto);
    if (orderCount <= 0) {
        throw new CustomException("Order creation failed");
    }
    return orderCount;
}
```

To begin testing

To test the above code using JMeter, see Installation usage of JMeter

1. Initialize the database inventory

2. Configure JMeter

Open JMeter, add a test plan simulating 1000 concurrent threads seckilling a product with 10 units of inventory, fill in the request address, and click the start icon to begin

3. Results

The product shows 10 sold and 0 remaining inventory, but the order table contains more than 10 rows

Druid SQL analysis

  • The SQL log shows the order-insert and inventory-update statements executed 74 times; the inventory was clearly oversold

We can see that under concurrency, errors and overselling occur. This is because a large number of threads simultaneously request "check inventory, deduct inventory, create order", and these three operations are not atomic as a whole; for example, many threads read an inventory of 10 at the same moment and all pass the inventory check, hence the overselling.

Using optimistic locking to control overselling

Since the traditional way oversells, we introduce the concept of a lock, which comes in two kinds: optimistic and pessimistic. A pessimistic lock sacrifices performance to protect the data, so in a high-concurrency scenario like this an optimistic lock is generally used to solve the problem.

Approach

Optimistic locking: give the inventory row a version number, and let only the update that carries the matching version succeed.

The main change is in inventory deduction: each thread obtains the product's optimistic-lock version number while checking inventory and carries it into the deduction. If the version number no longer matches, the deduction fails and an exception ends the request. For any given version number only one thread can succeed; other threads holding the same version fail their seckill, so no overselling occurs.

Code implementation

Interface entry

```java
/**
 * Place an order using an optimistic lock
 */
@PostMapping("/createOptimisticLockOrder/{id}")
public ResponseBean createOptimisticLockOrder(@PathVariable("id") Integer id) throws Exception {
    Integer orderCount = seckillEvolutionService.createOptimisticLockOrder(id);
    return new ResponseBean(HttpStatus.OK.value(), "purchase success", orderCount);
}
```

Core service logic (pure DB operations): check inventory, deduct inventory, create order

```java
@Override
@Transactional(rollbackFor = Exception.class)
public Integer createOptimisticLockOrder(Integer id) throws Exception {
    // Check inventory
    StockDto stockDto = stockDao.selectByPrimaryKey(id);
    if (stockDto.getCount() <= 0) {
        throw new CustomException("Out of stock");
    }
    // Deduct inventory (optimistic lock: the UPDATE carries the version number)
    Integer saleCount = stockDao.updateByOptimisticLock(stockDto);
    if (saleCount <= 0) {
        throw new CustomException("Inventory deduction failed");
    }
    // Create order
    StockOrderDto stockOrderDto = new StockOrderDto();
    stockOrderDto.setStockId(stockDto.getId());
    Integer orderCount = stockOrderDao.insertSelective(stockOrderDto);
    if (orderCount <= 0) {
        throw new CustomException("Order creation failed");
    }
    return orderCount;
}
```
@update (" Update t_seckill_stock SET count = count - 1, sale = sale + 1, version = version + 1 " + "WHERE id = #{id, jdbcType = INTEGER} AND version = #{version, jdbcType = INTEGER} " + "") int updateByOptimisticLock(StockDto stockDto);Copy the code

To begin testing

To test the above code using JMeter, see Installation usage of JMeter

1. Initialize the database inventory

2. Configure JMeter

Open JMeter, add a test plan simulating 1000 concurrent threads seckilling a product with 10 units of inventory, fill in the request address, and click the start icon to begin

3. Results

The product shows 10 sold and 0 remaining inventory, and the order table contains exactly 10 rows

Druid SQL analysis

Multiple threads checking inventory at the same time obtain the same optimistic-lock version number for the item. During deduction, a mismatched version number makes the deduction fail and throw an exception, so for each version number only the first thread succeeds in deducting inventory; other threads with the same version fail, and no overselling occurs.

Using caching & optimistic locking

Approach

1. The cache is generally preheated in advance. We add a method that warms the cache right after initializing the inventory, so there will be no cache miss.

2. For keeping database and cache consistent, we update the database first and then update the cache. Because computing the cached values here is simple (just adding or subtracting one), we update the cache directly rather than deleting it.

3. The main changes are the inventory check and the deduction: the inventory check now goes straight to Redis and no longer queries the database, while the deduction is still the optimistic-lock operation; only when it succeeds (the inventory was actually deducted) do we update the cached data.

Code implementation

Cache warming

PostMapping("/initCache/{id}") public ResponseBean initCache(@pathVariable ("id") Integer ID) {StockDto stockDto = stockService.selectByPrimaryKey(id); Jedisutil.set (constant.prefix_count + id.tostring (), stockTo.getCount ().tostring ()); JedisUtil.set(Constant.PREFIX_SALE + id.toString(), stockDto.getSale().toString()); JedisUtil.set(Constant.PREFIX_VERSION + id.toString(), stockDto.getVersion().toString()); Return ResponseBean(httpstatus.ok.value (), "cache warmed up ", null); }Copy the code

Interface entry

```java
/**
 * Place an order with an optimistic lock plus a read cache, for better performance
 */
@PostMapping("/createOptimisticLockOrderWithRedis/{id}")
public ResponseBean createOptimisticLockOrderWithRedis(@PathVariable("id") Integer id) throws Exception {
    // Wrong, not thread safe:
    // Integer orderCount = seckillEvolutionService.createOptimisticLockOrderWithRedisWrong(id);
    // Correct, thread safe:
    Integer orderCount = seckillEvolutionService.createOptimisticLockOrderWithRedisSafe(id);
    return new ResponseBean(HttpStatus.OK.value(), "purchase success", null);
}
```

Core service logic: check the inventory cache, deduct inventory, create the order

```java
@Override
@Transactional(rollbackFor = Exception.class)
public Integer createOptimisticLockOrderWithRedisSafe(Integer id) throws Exception {
    // Check inventory: read from the cache to reduce DB pressure
    List<String> dataList = JedisUtil.mget(Constant.PREFIX_COUNT + id,
            Constant.PREFIX_SALE + id, Constant.PREFIX_VERSION + id);
    Integer count = Integer.parseInt(dataList.get(0));
    Integer sale = Integer.parseInt(dataList.get(1));
    Integer version = Integer.parseInt(dataList.get(2));
    if (count <= 0) {
        throw new CustomException("Out of stock");
    }
    StockDto stockDto = new StockDto();
    stockDto.setId(id);
    stockDto.setCount(count);
    stockDto.setSale(sale);
    stockDto.setVersion(version);
    // Deduct inventory (optimistic lock)
    Integer saleCount = stockDao.updateByOptimisticLock(stockDto);
    // Rows affected > 0 means the deduction succeeded, so update the cache
    if (saleCount > 0) {
        logger.info("version: {} {} {}", stockDto.getCount(), stockDto.getSale(), stockDto.getVersion());
        // Update the cache
        updateCache(stockDto);
    }
    if (saleCount <= 0) {
        throw new CustomException("Inventory deduction failed");
    }
    // Create order
    StockOrderDto stockOrderDto = new StockOrderDto();
    stockOrderDto.setStockId(stockDto.getId());
    Integer orderCount = stockOrderDao.insertSelective(stockOrderDto);
    if (orderCount <= 0) {
        throw new CustomException("Order creation failed");
    }
    Thread.sleep(10);
    return orderCount;
}
```
```java
/**
 * Update the database first, then the cache. For a detailed analysis of
 * database/cache consistency see
 * https://note.dolyw.com/cache/00-DataBaseConsistency.html
 */
public void updateCache(StockDto stockDto) {
    Integer count = stockDto.getCount() - 1;
    Integer sale = stockDto.getSale() + 1;
    Integer version = stockDto.getVersion() + 1;
    JedisUtil.mset(Constant.PREFIX_COUNT + stockDto.getId(), count.toString(),
            Constant.PREFIX_SALE + stockDto.getId(), sale.toString(),
            Constant.PREFIX_VERSION + stockDto.getId(), version.toString());
}
```

To begin testing

To test the above code using JMeter, see Installation usage of JMeter

1. Initialize the cache inventory

Initialize the database inventory

2. Configure JMeter

Open JMeter, add a test plan simulating 1000 concurrent threads seckilling a product with 10 units of inventory, fill in the request address, and click the start icon to begin

3. Results

The item is actually shown as 10 sold, the inventory is 0, and the order table only has 10 entries

Druid SQL analysis

With caching, the inventory-query SQL executes only once, during cache preheating, unlike before, when every inventory check hit the database

Distributed rate limiting & caching & optimistic locking

Approach

As mentioned before, the optimistic-lock update statement still executed nearly 100 times, even though only the 10 deductions that succeeded were valid requests; the rest were invalid. To follow the principle that as few requests as possible should land on the database, we add rate limiting here to intercept most of the invalid requests, so that the requests that finally reach the database are as valid as possible. For the available algorithms, see the notes on rate-limiting algorithms.

A fixed time window is good enough here, implemented as distributed rate limiting with Redis + Lua.

Code implementation

Cache warming

PostMapping("/initCache/{id}") public ResponseBean initCache(@pathVariable ("id") Integer ID) {StockDto stockDto = stockService.selectByPrimaryKey(id); Jedisutil.set (constant.prefix_count + id.tostring (), stockTo.getCount ().tostring ()); JedisUtil.set(Constant.PREFIX_SALE + id.toString(), stockDto.getSale().toString()); JedisUtil.set(Constant.PREFIX_VERSION + id.toString(), stockDto.getVersion().toString()); Return ResponseBean(httpstatus.ok.value (), "cache warmed up ", null); }Copy the code

The Lua script

  • Per-second limiting (limit how many requests per second)
Each request uses the current time, accurate to the second, as a Redis key with a 2s expiry, and Redis increments the key's value. The logic is written as a Lua script executed inside Redis, whose single-threaded execution makes each call atomic:

```lua
-- Seckill limiting: at most maxRequest requests per second
local key = KEYS[1]
local maxRequest = tonumber(ARGV[1])
-- Current count for this second (0 if the key does not exist yet)
local currentLimit = tonumber(redis.call('get', key) or "0")
if currentLimit + 1 > maxRequest then
    -- Over the threshold: limited
    return 0
else
    -- Count this request; the key lives for 2 seconds
    redis.call("INCRBY", key, 1)
    redis.call("EXPIRE", key, 2)
    return currentLimit + 1
end
```
  • Custom-parameter limiting (limit how many requests within a custom time span)
The script judges whether the number of requests in the current time window has reached the maximum: when the threshold is reached it returns 0, meaning the request is limited; otherwise it counts the request. Again the logic is written in Lua and executed atomically inside Redis:

```lua
-- Custom limiting: at most maxRequest requests per timeRequest milliseconds
-- Key holding the start time of the current window
local timeKey = KEYS[1]
-- Key holding the request count of the current window
local requestKey = KEYS[2]
-- Maximum number of requests in one window
local maxRequest = tonumber(ARGV[1])
-- Current time (ms)
local nowTime = tonumber(ARGV[2])
-- Window length (ms)
local timeRequest = tonumber(ARGV[3])
local currentTime = tonumber(redis.call('get', timeKey) or "0")
local currentRequest = tonumber(redis.call('get', requestKey) or "0")
-- Still inside the current time window
if currentTime + timeRequest > nowTime then
    if currentRequest + 1 > maxRequest then
        -- Over the threshold: limited
        return 0
    else
        redis.call("INCRBY", requestKey, 1)
        return currentRequest + 1
    end
else
    -- Window expired: reset it, then count this request
    redis.call('set', timeKey, nowTime)
    redis.call('set', requestKey, '0')
    redis.call("EXPIRE", timeKey, timeRequest / 1000)
    redis.call("EXPIRE", requestKey, timeRequest / 1000)
    redis.call("INCRBY", requestKey, 1)
    return 1
end
```

Interface entry

```java
/**
 * Place an order with an optimistic lock, a read cache, and rate limiting
 */
@Limit
@PostMapping("/createOptimisticLockOrderWithRedisLimit/{id}")
public ResponseBean createOptimisticLockOrderWithRedisLimit(@PathVariable("id") Integer id) throws Exception {
    // Correct, thread safe
    Integer orderCount = seckillEvolutionService.createOptimisticLockOrderWithRedisSafe(id);
    return new ResponseBean(HttpStatus.OK.value(), "purchase success", null);
}
```

The rate-limiting annotation

```java
/**
 * Rate-limiting annotation
 */
@Documented
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
public @interface Limit {

    /**
     * Maximum number of requests in one time window
     */
    String maxRequest() default "10";

    /**
     * Time window length (ms)
     */
    String timeRequest() default "1000";
}
```

LimitAspect, the rate-limiting aspect

```java
/**
 * Rate-limiting aspect
 */
@Order(0)
@Aspect
@Component
public class LimitAspect {

    private static final Logger logger = LoggerFactory.getLogger(LimitAspect.class);

    /**
     * Default time window (ms); 1000 means plain per-second limiting
     */
    private static final String TIME_REQUEST = "1000";

    @Autowired
    private RedisLimitUtil redisLimitUtil;

    /**
     * Pointcut on the @Limit annotation
     */
    @Pointcut("@annotation(com.example.limit.Limit)")
    public void aspect() {
    }

    @Around("aspect() && @annotation(limit)")
    public Object interceptor(ProceedingJoinPoint proceedingJoinPoint, Limit limit) {
        Object result = null;
        Long maxRequest = 0L;
        // Per-second limiting or custom time-window limiting
        if (TIME_REQUEST.equals(limit.timeRequest())) {
            maxRequest = redisLimitUtil.limit(limit.maxRequest());
        } else {
            maxRequest = redisLimitUtil.limit(limit.maxRequest(), limit.timeRequest());
        }
        // A returned count greater than 0 means the request is not limited
        if (maxRequest > 0) {
            // Let the request through and execute the target method
            try {
                result = proceedingJoinPoint.proceed();
            } catch (Throwable throwable) {
                throw new CustomException(throwable.getMessage());
            }
        } else {
            throw new CustomException("Requests are crowded, please try again later");
        }
        return result;
    }

    @Before("aspect() && @annotation(limit)")
    public void before(Limit limit) {
        // logger.info("before");
    }

    @After("aspect() && @annotation(limit)")
    public void after(Limit limit) {
        // logger.info("after");
    }
}
```

RedisLimitUtil

```java
/**
 * Redis rate-limiting utility
 */
@Component
public class RedisLimitUtil {

    private static final Logger logger = LoggerFactory.getLogger(RedisLimitUtil.class);

    /**
     * Per-second limiting script
     */
    private static String LIMIT_SECKILL_SCRIPT = null;

    /**
     * Custom-parameter limiting script
     */
    private static String LIMIT_CUSTOM_SCRIPT = null;

    /**
     * Key prefixes
     */
    private static final String LIMIT = "limit:";
    private static final String LIMIT_REQUEST = "limit:request";
    private static final String LIMIT_TIME = "limit:time";

    /**
     * Load the Lua scripts on construction
     */
    public RedisLimitUtil() {
        LIMIT_SECKILL_SCRIPT = getScript("redis/limit-seckill.lua");
        LIMIT_CUSTOM_SCRIPT = getScript("redis/limit-custom.lua");
    }

    /**
     * Per-second limiting
     */
    public Long limit(String maxRequest) {
        // Key: prefix + current time in seconds
        String key = LIMIT + String.valueOf(System.currentTimeMillis() / 1000);
        List<String> args = new ArrayList<>();
        args.add(maxRequest);
        return eval(LIMIT_SECKILL_SCRIPT, Collections.singletonList(key), args);
    }

    /**
     * Custom-parameter limiting
     */
    public Long limit(String maxRequest, String timeRequest) {
        List<String> keys = new ArrayList<>();
        keys.add(LIMIT_TIME);
        keys.add(LIMIT_REQUEST);
        List<String> args = new ArrayList<>();
        args.add(maxRequest);
        args.add(String.valueOf(System.currentTimeMillis()));
        args.add(timeRequest);
        return eval(LIMIT_CUSTOM_SCRIPT, keys, args);
    }

    /**
     * Execute a Lua script
     */
    private Long eval(String script, List<String> keys, List<String> args) {
        Object result = JedisUtil.eval(script, keys, args);
        return (Long) result;
    }

    /**
     * Read a Lua script from the classpath
     */
    private static String getScript(String path) {
        StringBuilder stringBuilder = new StringBuilder();
        InputStream inputStream = RedisLimitUtil.class.getClassLoader().getResourceAsStream(path);
        try (BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(inputStream))) {
            String str;
            while ((str = bufferedReader.readLine()) != null) {
                stringBuilder.append(str).append(System.lineSeparator());
            }
        } catch (IOException e) {
            logger.error(Arrays.toString(e.getStackTrace()));
            throw new CustomException("Problem reading the Lua limiting script: " + Arrays.toString(e.getStackTrace()));
        }
        return stringBuilder.toString();
    }
}
```

JedisUtil

```java
/**
 * Execute a Lua script on Redis
 */
public static Object eval(String script, List<String> keys, List<String> args) {
    Object result = null;
    try (Jedis jedis = jedisPool.getResource()) {
        result = jedis.eval(script, keys, args);
        return result;
    } catch (Exception e) {
        throw new CustomException("Failed to execute Lua script: script=" + script
                + " keys=" + keys.toString() + " args=" + args.toString()
                + " cause=" + e.getMessage());
    }
}
```

Core service logic: rate-limit first, then check the inventory cache, deduct inventory, create the order

```java
@Override
@Transactional(rollbackFor = Exception.class)
public Integer createOptimisticLockOrderWithRedisSafe(Integer id) throws Exception {
    // Check inventory: read from the cache to reduce DB pressure
    List<String> dataList = JedisUtil.mget(Constant.PREFIX_COUNT + id,
            Constant.PREFIX_SALE + id, Constant.PREFIX_VERSION + id);
    Integer count = Integer.parseInt(dataList.get(0));
    Integer sale = Integer.parseInt(dataList.get(1));
    Integer version = Integer.parseInt(dataList.get(2));
    if (count <= 0) {
        throw new CustomException("Out of stock");
    }
    StockDto stockDto = new StockDto();
    stockDto.setId(id);
    stockDto.setCount(count);
    stockDto.setSale(sale);
    stockDto.setVersion(version);
    // Deduct inventory (optimistic lock)
    Integer saleCount = stockDao.updateByOptimisticLock(stockDto);
    // Rows affected > 0 means the deduction succeeded, so update the cache
    if (saleCount > 0) {
        logger.info("version: {} {} {}", stockDto.getCount(), stockDto.getSale(), stockDto.getVersion());
        // Update the cache
        updateCache(stockDto);
    }
    if (saleCount <= 0) {
        throw new CustomException("Inventory deduction failed");
    }
    // Create order
    StockOrderDto stockOrderDto = new StockOrderDto();
    stockOrderDto.setStockId(stockDto.getId());
    Integer orderCount = stockOrderDao.insertSelective(stockOrderDto);
    if (orderCount <= 0) {
        throw new CustomException("Order creation failed");
    }
    Thread.sleep(10);
    return orderCount;
}
```
```java
/**
 * Update the database first, then the cache. For a detailed analysis of
 * database/cache consistency see
 * https://note.dolyw.com/cache/00-DataBaseConsistency.html
 */
public void updateCache(StockDto stockDto) {
    Integer count = stockDto.getCount() - 1;
    Integer sale = stockDto.getSale() + 1;
    Integer version = stockDto.getVersion() + 1;
    JedisUtil.mset(Constant.PREFIX_COUNT + stockDto.getId(), count.toString(),
            Constant.PREFIX_SALE + stockDto.getId(), sale.toString(),
            Constant.PREFIX_VERSION + stockDto.getId(), version.toString());
}
```

To begin testing

To test the above code using JMeter, see Installation usage of JMeter

1. Initialize the cache inventory

Initialize the database inventory

2. Configure JMeter

PS: this time we set the JMeter ramp-up period to 5 seconds, meaning the threads start over 5 seconds, about 200 concurrent requests per second, because if they all executed within 1s, most of the traffic would simply be rate-limited

Open JMeter, add a test plan simulating 1000 concurrent threads seckilling a product with 10 units of inventory, fill in the request address, and click the start icon to begin

3. Results

Looking at the back-end logs, we can see that many requests were rejected directly by the rate limiter, which is exactly what we wanted

The item is actually shown as 10 sold, the inventory is 0, and the order table only has 10 entries

Druid SQL analysis

With rate limiting in place, the optimistic-lock update executed only 19 times instead of 61, and many requests were cut off directly by the limiter

Asynchronous ordering

What else can we do to improve throughput and performance? All the examples above handle requests synchronously; we can turn synchronous processing into asynchronous processing to improve performance.

Once a request passes the rate limiter and the inventory check, its order information is published to a message queue and the request returns immediately; a consumer program then performs the ordering and persists the data. Because the flow is asynchronous, a reconciliation step or some other notification mechanism must ultimately tell the user whether the purchase succeeded.
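A minimal sketch of this asynchronous flow (again an in-process BlockingQueue standing in for a real MQ; the message type and return strings are assumptions):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class AsyncOrderFacade {

    /** Hypothetical order message carried through the queue. */
    public static final class OrderMessage {
        final Integer stockId;
        final Long userId;
        OrderMessage(Integer stockId, Long userId) {
            this.stockId = stockId;
            this.userId = userId;
        }
    }

    // Stand-in for a real MQ topic (RocketMQ/Kafka/RabbitMQ in production)
    private final BlockingQueue<OrderMessage> topic = new ArrayBlockingQueue<>(10_000);

    /** Request path: rate limiting and the cached stock check happened before this call. */
    public String submitOrder(Integer stockId, Long userId) {
        boolean accepted = topic.offer(new OrderMessage(stockId, userId));
        // Return immediately; the user is told "request accepted" and is later
        // notified (poll or push) whether the order really succeeded.
        return accepted ? "queued" : "system busy, try again";
    }

    /** Consumer side: performs the real deduction and order insert. */
    public void consumeLoop() throws InterruptedException {
        while (true) {
            OrderMessage msg = topic.take();
            // Something like createOptimisticLockOrderWithRedisSafe(msg.stockId)
            // would run here; failures must trigger a compensation/notification flow.
        }
    }
}
```

The request path stays O(1) and returns at once; the consumer deducts inventory and inserts orders at a controlled rate, and a reconciliation or notification flow tells the user whether the purchase really succeeded.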

References

  • Thanks to the following notes and articles

  • www.mamicode.com/info-detail…
  • time.geekbang.org/column/arti…
  • note.dolyw.com/distributed…
  • note.dolyw.com/seckill-evo…
  • www.cnblogs.com/stulzq/p/89…