A hardcore walkthrough of seckill (flash sale) system design

1 Seckill scenarios

Typical seckill scenarios:

Grabbing train tickets on 12306
Buying Feitian Moutai at 1599 yuan
Buying tickets for Jay Chou's concert
Double 11 seckill promotions

Key concerns in a seckill scenario

Strictly prevent overselling: if the inventory is 1000 items and 1020 get sold, someone will want to sacrifice a programmer to the heavens! Preventing overselling is the core part of seckill system design.
Block abuse: stop malicious bargain hunters (the "wool party") and gray-market bots from scooping up the deals.
Ensure user experience: under high concurrency, give users a smooth shopping experience and support as high a QPS as possible.

Let's refine the seckill design step by step around these concerns.

2 Version 1 – Streaking (no protection)

Without much thought, start with a plain SpringBoot + MyBatis seckill design. The flow is as follows:

The Controller layer receives the user's seckill request and calls the Service layer.
The Service layer receives the request and checks whether the sold count equals the total inventory. If they are equal, the item is sold out; if not, stock remains, and it calls the DAO layer to add 1 to the sold count.
The DAO layer receives the request and uses MyBatis to operate on the database directly, adding 1 to the sold count and creating the order.

Testing with Postman works fine, but under JMeter, a professional concurrency testing tool, you will find that the number of orders created exceeds the available stock (inventory minus sold). For example, suppose users A and B send seckill requests concurrently when stock = 100 and sold = 64.

User A sends a seckill request. The Service layer is called and finds that the sold count is not equal to the inventory. It reads sold = 64, updates sold to 65, and then creates an order.
User B sends a seckill request. The Service layer is called and also finds that the sold count is not equal to the inventory. It also reads sold = 64, updates sold to 65, and then creates an order.
The sold count increased by only 1, yet two orders were created. The item is oversold!
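
To make the A/B interleaving concrete, here is a minimal sketch that replays it step by step in a single thread. Everything in it (the class OversellDemo, the methods readSold and updateAndCreateOrder) is made up for illustration and stands in for the Service and DAO layers; it is not the project's actual code.

```java
// Illustrative only: replays the unlucky A/B schedule deterministically.
class OversellDemo {
    static int stock = 100;  // total inventory
    static int sold = 64;    // sold count shared by all requests
    static int orders = 0;   // number of orders created

    // Step 1 of a request: read the sold count (the check in the Service layer).
    static int readSold() {
        return sold;
    }

    // Step 2 of a request: write back "what I read + 1" and create an order.
    static void updateAndCreateOrder(int soldSeenByRequest) {
        if (soldSeenByRequest == stock) {
            throw new IllegalStateException("sold out");
        }
        sold = soldSeenByRequest + 1;
        orders++;
    }

    // Runs the interleaving from the text and returns {sold, orders}.
    static int[] runInterleaving() {
        stock = 100;
        sold = 64;
        orders = 0;
        int seenByA = readSold();       // A reads 64
        int seenByB = readSold();       // B also reads 64, before A has written
        updateAndCreateOrder(seenByA);  // A writes 65 and creates order 1
        updateAndCreateOrder(seenByB);  // B writes 65 again and creates order 2
        return new int[] {sold, orders};
    }

    public static void main(String[] args) {
        int[] result = runInterleaving();
        System.out.println("sold=" + result[0] + ", orders=" + result[1]);
    }
}
```

Two orders exist while the sold count moved by one, which is exactly the gap JMeter exposes.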


3 Version 2 – Pessimistic lock

Since a Spring Controller is a singleton by default, I can use synchronized to lock the code in the Controller layer that calls the Service layer.

This solves the overselling problem. But note that because this is a pessimistic lock, out of 1000 concurrent requests only 1 gets the lock at a time, while the other 999 compete for it.

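@Service
@Transactional
@Slf4j
public class OrderServiceImpl implements OrderService {

    @Override
    public Integer createOrder(Integer id) {
        // Check inventory
        Stock stock = checkStock(id);
        // Update the sold count
        updateSale(stock);
        // Create an order
        return createOrder(stock);
    }

    // checkStock, updateSale and createOrder(Stock) are helper methods (omitted here)
}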

The Service also uses Spring's @Transactional annotation, so that the inventory update and the order creation either both succeed or both fail. Note that @Transactional by itself does not serialize requests. The synchronized block has to wrap the whole transactional call, which is why it sits in the Controller: the lock is then released only after the transaction has committed. Two tips when using @Transactional:

Place MySQL statements as close to the end of the method body as possible, because the database work of the transaction effectively starts at the first MySQL statement executed and ends when the method ends.
Set a transaction timeout. If you do not, the default is -1, meaning no limit. The timeout covers the time taken by the last MySQL statement plus all the time spent before it.

Note: with a pessimistic lock the items are guaranteed to sell out (as long as there are enough requests), because a thread that does not get the lock blocks and waits for it. But that blocking gives users a very bad experience.
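
The effect of the synchronized approach can be sketched without Spring. In the minimal example below, the class SynchronizedSeckillDemo and its methods are invented for illustration, and an in-memory counter stands in for the database; the one thing it tries to show is that the whole check, update, and create sequence runs under a single lock.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Illustrative only: in-memory counters stand in for the inventory table.
class SynchronizedSeckillDemo {
    private final int stock;
    private int sold = 0;
    private int orders = 0;

    SynchronizedSeckillDemo(int stock) {
        this.stock = stock;
    }

    // Stand-in for the Controller method: only one request at a time
    // may run the check, the update, and the order creation.
    boolean seckill() {
        synchronized (this) {
            if (sold == stock) {
                return false;  // sold out
            }
            sold++;
            orders++;
            return true;
        }
    }

    // Fires `requests` concurrent seckill calls and returns {sold, orders}.
    static int[] run(int stock, int requests) {
        SynchronizedSeckillDemo demo = new SynchronizedSeckillDemo(stock);
        ExecutorService pool = Executors.newFixedThreadPool(16);
        CountDownLatch done = new CountDownLatch(requests);
        for (int i = 0; i < requests; i++) {
            pool.execute(() -> {
                demo.seckill();
                done.countDown();
            });
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        synchronized (demo) {
            return new int[] {demo.sold, demo.orders};
        }
    }

    public static void main(String[] args) {
        int[] result = run(100, 1000);
        System.out.println("sold=" + result[0] + ", orders=" + result[1]);
    }
}
```

With 1000 requests against a stock of 100, both counters end at 100: nothing is oversold and nothing is left over, at the cost of every request queueing on the same lock.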

4 Version 3 – Optimistic lock

We attach a version number to the sold count of each item. When the Service layer is called, it reads the sold count together with its version number, and then updates the sold count and the version number in a single statement that only matches if the version is unchanged. If the update succeeds, the purchase succeeds; if the update fails, the purchase fails. With this scheme, a faster hand does not necessarily win, but at least nothing is oversold.

-- table and column names (stock, sale) are illustrative
UPDATE stock SET sale = sale + 1, version = version + 1 WHERE id = #{id} AND version = #{version}

Note: with an optimistic lock, requests fail at random (whenever another request updated the row first), so a few items may remain unsold after the event!
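
The version check can be sketched in plain Java. In the example below, the class OptimisticLockDemo, its nested Row type, and the method updateIfVersionMatches are hypothetical names; a synchronized method plays the role of the single conditional UPDATE, returning the number of affected rows the way MyBatis would report it.

```java
// Illustrative only: one in-memory "row" stands in for the inventory table.
class OptimisticLockDemo {
    // Snapshot of the row: sold count plus version number.
    static final class Row {
        final int sold;
        final int version;

        Row(int sold, int version) {
            this.sold = sold;
            this.version = version;
        }
    }

    private final int stock;
    private Row row = new Row(0, 0);
    private int orders = 0;

    OptimisticLockDemo(int stock) {
        this.stock = stock;
    }

    synchronized Row read() {
        return row;
    }

    // Mimics "UPDATE ... WHERE id = ? AND version = ?".
    // Returns the number of affected rows: 1 on success, 0 if the version moved.
    synchronized int updateIfVersionMatches(int expectedVersion) {
        if (row.version != expectedVersion) {
            return 0;
        }
        row = new Row(row.sold + 1, row.version + 1);
        return 1;
    }

    synchronized int orders() {
        return orders;
    }

    // One seckill attempt with no retry: losing the version race means failure.
    boolean seckill() {
        Row seen = read();
        if (seen.sold == stock) {
            return false;  // sold out
        }
        if (updateIfVersionMatches(seen.version) == 0) {
            return false;  // someone else updated first
        }
        synchronized (this) {
            orders++;
        }
        return true;
    }

    public static void main(String[] args) {
        OptimisticLockDemo demo = new OptimisticLockDemo(100);
        Row seenByA = demo.read();
        Row seenByB = demo.read();  // same version as A
        System.out.println("A affected rows: " + demo.updateIfVersionMatches(seenByA.version));
        System.out.println("B affected rows: " + demo.updateIfVersionMatches(seenByB.version));
    }
}
```

When A and B read the same version, only the first update matches; the second affects zero rows, so B's purchase fails instead of creating a second order.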

5 Version 4 – Rate limiting

The core overselling problem is solved; what follows is a series of optimizations. Under highly concurrent requests, if the interface has no rate limiting, the backend servers come under great pressure. Therefore the seckill system is usually deployed on its own servers so that it does not affect other services, and rate limiting is configured as well.

The commonly used rate limiting methods are the leaky bucket algorithm and the token bucket algorithm, which we mentioned earlier when covering Redis. RateLimiter in Guava, Google's open-source library, uses the token bucket algorithm. When developing high-concurrency systems, there are three powerful tools for protecting the system: caching, degradation, and rate limiting.

Caching: the purpose of caching is to speed up system access and increase the system's processing capacity.
Degradation: when server pressure rises sharply, some services and pages are deliberately downgraded according to current business conditions and traffic, in order to free server resources and keep the core tasks running normally.
Rate limiting: the purpose of rate limiting is to protect the system by limiting the rate of concurrent access, or the number of requests within a time window. Once the limit is reached, the system can deny service, queue or make requests wait, or degrade.

5.1 Leaky bucket algorithm

The idea behind the leaky bucket algorithm: requests are the water, and the bucket is the limit of the system's processing capacity. Water enters the bucket first and flows out of it at a fixed rate. When the outflow rate is lower than the inflow rate, the bucket eventually fills up because its capacity is limited, and any further water overflows (the request is rejected), which achieves rate limiting.
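
A minimal sketch of this idea follows. The class LeakyBucket and its method tryAccept are my own names, and the current time is passed in explicitly so the behavior is easy to reason about (in a real service you would likely pass System.currentTimeMillis()). It is the simple "meter" form: it tracks the water level and rejects on overflow, without actually queueing requests.

```java
// Illustrative only: a leaky bucket used as a meter.
class LeakyBucket {
    private final long capacity;         // how much water the bucket can hold
    private final double leakPerSecond;  // fixed outflow rate
    private double water = 0;            // current water level
    private long lastLeakMillis;         // last time the level was updated

    LeakyBucket(long capacity, double leakPerSecond, long nowMillis) {
        this.capacity = capacity;
        this.leakPerSecond = leakPerSecond;
        this.lastLeakMillis = nowMillis;
    }

    // Returns true if the request fits into the bucket, false if it overflows.
    synchronized boolean tryAccept(long nowMillis) {
        // Let water leak out for the time that has passed since the last call.
        double leaked = (nowMillis - lastLeakMillis) * leakPerSecond / 1000.0;
        water = Math.max(0, water - leaked);
        lastLeakMillis = nowMillis;

        if (water + 1 > capacity) {
            return false;  // bucket is full: the request overflows and is rejected
        }
        water += 1;        // the request enters the bucket
        return true;
    }

    public static void main(String[] args) {
        LeakyBucket bucket = new LeakyBucket(2, 1.0, 0);
        System.out.println(bucket.tryAccept(0));     // fits
        System.out.println(bucket.tryAccept(0));     // fits, bucket now full
        System.out.println(bucket.tryAccept(0));     // overflows
        System.out.println(bucket.tryAccept(1000));  // one unit leaked out, fits again
    }
}
```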

5.2 Token bucket algorithm

How the token bucket algorithm works: think of registering at a hospital, where you can only be seen after you have taken a number.

The process is roughly:

Every request needs an available token before it can be processed.
The rate at which tokens are added to the bucket is set according to the desired rate limit.
The bucket has a maximum capacity; when the bucket is full, newly added tokens are discarded.
When a request arrives, it must first take a token from the bucket, and only while holding the token can it run the business logic. Once the business logic finishes, the token is simply deleted.
If a request cannot get a token, it can either block and wait or give up after a timeout.
The bucket has a minimum limit: when the number of tokens in the bucket drops to that minimum, tokens are not deleted after a request is processed, which ensures adequate rate limiting.
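
The core of these steps can be sketched in a few lines. The class TokenBucket below is a hypothetical, simplified version written for this article, not Guava's implementation: it refills lazily on each call, starts full, takes the current time as a parameter, and leaves out both the blocking wait and the minimum-limit rule from the last step.

```java
// Illustrative only: a minimal token bucket with lazy refill.
class TokenBucket {
    private final long capacity;           // maximum number of tokens in the bucket
    private final double refillPerSecond;  // rate at which tokens are added
    private double tokens;                 // tokens currently available
    private long lastRefillMillis;         // last time tokens were added

    TokenBucket(long capacity, double refillPerSecond, long nowMillis) {
        this.capacity = capacity;
        this.refillPerSecond = refillPerSecond;
        this.tokens = capacity;            // assumption: the bucket starts full
        this.lastRefillMillis = nowMillis;
    }

    // Takes one token if available; never blocks.
    synchronized boolean tryAcquire(long nowMillis) {
        // Add the tokens earned since the last call; extras beyond capacity are discarded.
        double added = (nowMillis - lastRefillMillis) * refillPerSecond / 1000.0;
        tokens = Math.min(capacity, tokens + added);
        lastRefillMillis = nowMillis;

        if (tokens < 1) {
            return false;  // no token available: reject (or the caller may wait and retry)
        }
        tokens -= 1;       // the request holds the token; it is gone once used
        return true;
    }

    public static void main(String[] args) {
        TokenBucket bucket = new TokenBucket(2, 1.0, 0);
        System.out.println(bucket.tryAcquire(0));     // token 1 of 2
        System.out.println(bucket.tryAcquire(0));     // token 2 of 2
        System.out.println(bucket.tryAcquire(0));     // bucket empty
        System.out.println(bucket.tryAcquire(1500));  // 1.5 tokens refilled
    }
}
```

Unlike the leaky bucket, a full token bucket lets a short burst through at once (up to its capacity), which is one reason it tends to be preferred in practice.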

In practice, the token bucket algorithm is the one used most of the time, typically through RateLimiter in Google's Guava.

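import com.google.common.util.concurrent.RateLimiter;
import java.util.concurrent.TimeUnit;

// Create a token bucket instance that issues 20 tokens per second
private RateLimiter rateLimiter = RateLimiter.create(20);
// Block until a token is available, then proceed
rateLimiter.acquire();
// Wait up to 3 seconds for a token; returns a boolean telling whether one was obtained
boolean acquired = rateLimiter.tryAcquire(3, TimeUnit.SECONDS);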

6 Version 5 – Detail optimization

With optimistic locking and rate limiting in place, we can turn to the details.

A seckill must have a time limit; it cannot accept seckill requests at just any time. So we implement a time-limited sale.
What if someone technical who knows how to capture packets obtains the address of the seckill interface? So we hide the seckill interface.
Limit how often each user may access the interface per unit of time: an access frequency limit.

6.1 Time-limited sale

This is very simple: put the seckill item into Redis and set an expiration time. For example, we use "kill" + item ID as the key and the item ID as the value, with an expiration of 180 seconds.

127.0.0.1:6379> set kill1 1 EX 180
OK

Add a time check:

public Integer createOrder(Integer id) {
    // Check the buying window in Redis: the key expires when the sale ends
    if (!stringRedisTemplate.hasKey("kill" + id)) {
        throw new RuntimeException("Seckill timed out, the activity is over!");
    }
    // Check inventory
    Stock stock = checkStock(id);
    // Update the sold count
    updateSale(stock);
    // Create the order
    return createOrder(stock);
}

6.2 Hiding the seckill interface

Before the seckill, the user obtains an MD5 value through the getMd5 method.
In getMd5, key = item ID + user ID and value = MD5 of (item ID + user ID + salt). The key-value pair is stored in Redis with an expiration time, and the value is returned as the MD5 value.
The user must carry this MD5 value when requesting the seckill URL. The Service layer then fetches the stored value from Redis by item ID + user ID and checks whether it matches the MD5 value in the request. Only if they match can the next step proceed.

// Generate an MD5 based on the item ID and user ID.
@Override
public String getMd5(Integer id, Integer userid) {
  // Verify that the user is valid
  User user = userDAO.findById(userid);
  if (user == null) throw new RuntimeException("User information does not exist!");

  // Verify that the item is valid
  Stock stock = stockDAO.checkStock(id);
  if (stock == null) throw new RuntimeException("Item information is invalid!");

  String hashKey = "KEY_" + userid + "_" + id;
  // Generate the MD5. "!AW#" is an example salt; it could also be generated randomly.
  // The IDs are joined as strings, since userid + id alone would add the two integers.
  String key = DigestUtils.md5DigestAsHex((userid + "_" + id + "!AW#").getBytes());
  stringRedisTemplate.opsForValue().set(hashKey, key, 3600, TimeUnit.SECONDS);
  return key;
}
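
The verification side of this scheme can be sketched as follows. Everything here is an assumption made for illustration: the class HiddenInterfaceDemo, the verify method, the example salt, and a ConcurrentHashMap standing in for Redis (with expiration left out). The JDK's MessageDigest replaces Spring's DigestUtils so the sketch runs on its own.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative only: a map stands in for Redis, expiration is omitted.
class HiddenInterfaceDemo {
    private static final String SALT = "example-salt";  // hypothetical salt
    private final Map<String, String> store = new ConcurrentHashMap<>();

    // Hex-encoded MD5 of a string, using only the JDK.
    static String md5Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Step 1: issue the MD5 value and remember it under item ID + user ID.
    String getMd5(int id, int userid) {
        String hashKey = "KEY_" + userid + "_" + id;
        String md5 = md5Hex(userid + "_" + id + SALT);
        store.put(hashKey, md5);
        return md5;
    }

    // Step 2: the seckill request must present the same MD5 value.
    boolean verify(int id, int userid, String md5) {
        String expected = store.get("KEY_" + userid + "_" + id);
        return expected != null && expected.equals(md5);
    }

    public static void main(String[] args) {
        HiddenInterfaceDemo demo = new HiddenInterfaceDemo();
        String md5 = demo.getMd5(1, 42);
        System.out.println(demo.verify(1, 42, md5));      // matches the stored value
        System.out.println(demo.verify(1, 43, md5));      // nothing issued for user 43
        System.out.println(demo.verify(1, 42, "bogus"));  // wrong value
    }
}
```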

At this point, a user who requests the seckill interface directly is blocked. But if the attacker upgrades their technique and scripts the MD5 request together with the seckill request, we still cannot stop the deals from being scooped up! What to do? Limit how often each user can access the interface.

6.3 Access frequency limit

After the request passes the checks above, generate a key in Redis based on the user ID. The value is the access count, starting at 0, and the key-value pair is given an expiration time.
Before the hidden-interface (MD5) verification, check the user's access count within the unit of time. If it exceeds the threshold, reject the request; if it does not, proceed to the hidden-interface verification.
This section uses a per-user limit as the example; a per-IP limit works the same way.
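
A minimal sketch of this counter follows. The class AccessFrequencyLimiter and its method allow are hypothetical names, and an in-memory map with an explicit expiry timestamp stands in for the Redis key with its expiration time; the current time is passed in so the window behavior is easy to follow.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a map with expiry timestamps stands in for Redis keys with a TTL.
class AccessFrequencyLimiter {
    private static final class Counter {
        int count;             // accesses seen in the current window
        long expiresAtMillis;  // when this counter expires
    }

    private final int maxPerWindow;
    private final long windowMillis;
    private final Map<Integer, Counter> counters = new HashMap<>();

    AccessFrequencyLimiter(int maxPerWindow, long windowMillis) {
        this.maxPerWindow = maxPerWindow;
        this.windowMillis = windowMillis;
    }

    // Counts one access for the user and reports whether it is within the limit.
    synchronized boolean allow(int userId, long nowMillis) {
        Counter counter = counters.get(userId);
        if (counter == null || nowMillis >= counter.expiresAtMillis) {
            // No counter yet, or the old one expired: start a new window at 0.
            counter = new Counter();
            counter.expiresAtMillis = nowMillis + windowMillis;
            counters.put(userId, counter);
        }
        counter.count++;
        return counter.count <= maxPerWindow;
    }

    public static void main(String[] args) {
        AccessFrequencyLimiter limiter = new AccessFrequencyLimiter(3, 1000);
        for (int i = 1; i <= 4; i++) {
            System.out.println("access " + i + " allowed: " + limiter.allow(7, 0));
        }
        System.out.println("after the window: " + limiter.allow(7, 1000));
    }
}
```

In the real flow, a call like allow would run first, and only an allowed request would go on to the MD5 check.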

7 Version 6 – Further optimizations

CDN acceleration: why is JD Logistics fast? Because it has warehouses all over the country. In the same way, we can place the static front-end resources in different locations across the country, so that users request the copy nearest to them.
Graying out the front-end button: if you have taken part in a seckill, you will have noticed that the seckill button is gray before the start time and becomes clickable only when the time is up. Even after the seckill starts, the button cannot be clicked without limit; it may allow, say, only 10 clicks per second.
Nginx load balancing: a single Tomcat typically handles around 200~1000 QPS. For a seckill on the scale of Taobao or JD, you need Nginx load balancing in front of multiple instances to support tens of thousands of concurrent requests.
Storing information in Redis: MySQL alone cannot support tens of thousands of QPS, while Redis claims to support QPS on the order of 100,000, so we keep the data in Redis. Some may object that MySQL has optimistic locking and transactions while Redis lacks comparable transactions. In fact, we can use a Lua script to make a sequence of Redis operations atomic under concurrency.
Message-oriented middleware for peak clipping: if the number of successful seckills is large, writing all the orders straight into MySQL is not appropriate. Instead, write the information of each successful user to message-oriented middleware such as RabbitMQ or Kafka, return a message telling the user that the purchase succeeded, and then consume the messages from the middleware (order generation, data persistence). Because consumption is asynchronous, the user might not see the order right after a successful seckill, so before the order is generated the user is shown that the order is being processed, and is notified of success once the asynchronous consumption succeeds.
Auxiliary measures: a rehearsal before the seckill is essential, and QPS monitoring, CPU monitoring, IO monitoring, and cache monitoring are also necessary once the system is live. Circuit breaking and rate limiting should be in place in case a service really fails.
Short URLs: sometimes someone sends you a very short URL, and opening it jumps straight to a familiar shopping page. This involves short URL mapping. The general idea is to maintain a mapping between links, and on top of that you can play all kinds of tricks. It is quite interesting and worth exploring if you are curious.
Industrialization: a production-grade system will involve at least MQ, SpringBoot, Redis, Dubbo, ZK, Maven, Lua, and more. I also found a great project on GitHub (see the references).
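
The peak-clipping idea from the list above can be sketched in memory. In this example, the class PeakClippingDemo is invented for illustration: an AtomicInteger stands in for the stock held in Redis, a BlockingQueue for RabbitMQ or Kafka, and a plain list for MySQL. It only shows the split between the fast path that answers the user and the slow path that persists orders.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative only: in-memory stand-ins for Redis, the message queue, and MySQL.
class PeakClippingDemo {
    private final AtomicInteger remaining;                                     // stock in "Redis"
    private final BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();  // "MQ"
    private final List<String> persistedOrders = new ArrayList<>();            // "MySQL"

    PeakClippingDemo(int stock) {
        this.remaining = new AtomicInteger(stock);
    }

    // Fast path: atomically take one unit of stock, then enqueue the user.
    // Returns true if the user won; the order itself is created later.
    boolean seckill(int userId) {
        while (true) {
            int current = remaining.get();
            if (current <= 0) {
                return false;  // sold out
            }
            if (remaining.compareAndSet(current, current - 1)) {
                queue.add(userId);
                return true;
            }
        }
    }

    // Slow path: a consumer drains the queue and persists the orders.
    int drainAndPersist() {
        int persisted = 0;
        Integer userId;
        while ((userId = queue.poll()) != null) {
            persistedOrders.add("order-for-user-" + userId);
            persisted++;
        }
        return persisted;
    }

    int pending() {
        return queue.size();
    }

    int persistedCount() {
        return persistedOrders.size();
    }

    public static void main(String[] args) {
        PeakClippingDemo demo = new PeakClippingDemo(2);
        System.out.println(demo.seckill(1) + " " + demo.seckill(2) + " " + demo.seckill(3));
        System.out.println("pending=" + demo.pending());
        System.out.println("persisted=" + demo.drainAndPersist());
    }
}
```

Between the two paths the user has won but the order does not exist yet, which is the window where the "order is being processed" message is needed.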

8 References

Bilibili: b23.tv/IsifGk
GitHub: github.com/qiurunze123…