12306 ticket grabbing: thinking about the limits of concurrency

Every holiday season, people in first- and second-tier cities who travel home or go on trips face the same problem: grabbing train tickets! Although in most cases you can now book in advance, everyone knows the feeling of tickets going on sale and vanishing in an instant. During the Spring Festival in particular, people not only use 12306 but also turn to Zhixing and other ticket-grabbing apps, as hundreds of millions of people across the country scramble for tickets at the same time. The "12306 service" endures a QPS that no flash-sale system in the world can match, and millions of concurrent requests are perfectly normal! I studied the 12306 server architecture and learned many highlights of its system design. Here I would like to share them with you and simulate an example: how to provide normal, stable service when 1 million people try to grab 10,000 train tickets at the same time. Github code address

1. Large and highly concurrent system architecture

A highly concurrent system is deployed as a distributed cluster. Load balancing sits above the service layer, and various disaster-recovery measures (such as dual data centers, node fault tolerance, and server disaster recovery) ensure high availability, while traffic is balanced across servers according to their load capacity and the configured policies. Here is a simple diagram:

1.1 Overview of Load Balancing

In the figure above, user requests pass through three layers of load balancing before reaching the servers. Each layer is briefly introduced below:

  • OSPF (Open Shortest Path First) is an Interior Gateway Protocol (IGP). OSPF advertises the state of network interfaces between routers to build a link-state database and generate a shortest-path tree. OSPF automatically calculates the cost of a routed interface, although the cost can also be set manually. The cost calculated by OSPF is inversely proportional to the interface bandwidth: the higher the bandwidth, the lower the cost. Paths with the same cost can be load-balanced, with at most six links balanced simultaneously.

  • LVS (Linux Virtual Server) is a cluster technology that uses IP load balancing together with content-based request distribution. The scheduler has a high throughput rate, distributes requests evenly across the real servers, and automatically shields server failures, turning a group of servers into a single high-performance, highly available virtual server.

  • Nginx is a very high-performance HTTP proxy/reverse-proxy server, often used for load balancing in service development. Nginx implements load balancing mainly in three ways: round-robin, weighted round-robin, and IP hash. In the following section we will configure and test weighted round-robin in Nginx.

1.2 Nginx weighted round-robin demo

Nginx implements load balancing through the upstream module. With weighted round-robin, each backend is assigned a weight, usually chosen according to the server's performance and load capacity. The following configuration sets up weighted round-robin on local ports 3001-3004 with weights 1, 2, 3 and 4 respectively:

upstream load_rule {
    server 127.0.0.1:3001 weight=1;
    server 127.0.0.1:3002 weight=2;
    server 127.0.0.1:3003 weight=3;
    server 127.0.0.1:3004 weight=4;
}
...
server {
    listen 80;
    server_name load_balance.com www.load_balance.com;
    location / {
        proxy_pass http://load_rule;
    }
}

I added the virtual domain www.load_balance.com to my local /etc/hosts file, then started four HTTP services written in Go, one listening on each port. Below is the Go program listening on port 3001; the others only need the port changed:

package main

import (
    "net/http"
    "os"
    "strings"
)

func main() {
    http.HandleFunc("/buy/ticket", handleReq)
    http.ListenAndServe(":3001", nil)
}

// Handle the request and record which port served it.
func handleReq(w http.ResponseWriter, r *http.Request) {
    failedMsg := "handle in port:3001"
    writeLog(failedMsg, "./stat.log")
}

func writeLog(msg string, logPath string) {
    fd, _ := os.OpenFile(logPath, os.O_RDWR|os.O_CREATE|os.O_APPEND, 0644)
    defer fd.Close()
    content := strings.Join([]string{msg, "\r\n"}, "")
    buf := []byte(content)
    fd.Write(buf)
}

Each request's port information is written to the ./stat.log file, and I used the ab benchmarking tool to run a load test:

ab -n 1000 -c 100 http://www.load_balance.com/buy/ticket

Counting the log results, ports 3001-3004 received 100, 200, 300 and 400 requests respectively, which matches the weight ratio configured in Nginx very well, and the traffic is distributed evenly and with a degree of randomness. If you want to understand the underlying implementation, look into Nginx's upstream load-balancing module.
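If you want to reproduce the tally, a quick way is to count the log lines per port. Below is a small Go sketch of my own (not part of the demo code), assuming each line of ./stat.log looks like "handle in port:3001" as written by the servers above:

package main

import (
    "bufio"
    "fmt"
    "os"
    "strings"
)

func main() {
    // Open the log written by the four demo servers.
    f, err := os.Open("./stat.log")
    if err != nil {
        panic(err)
    }
    defer f.Close()

    counts := make(map[string]int)
    scanner := bufio.NewScanner(f)
    for scanner.Scan() {
        line := strings.TrimSpace(scanner.Text())
        if line == "" {
            continue
        }
        // The port number is whatever follows the last ':' in the line.
        if idx := strings.LastIndex(line, ":"); idx >= 0 && idx+1 < len(line) {
            counts[line[idx+1:]]++
        }
    }
    for port, n := range counts {
        fmt.Printf("port %s: %d requests\n", port, n)
    }
}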


2. Choosing a scheme for the seckill system

Back to our original question: how can the train-ticket seckill system provide normal, stable service under high concurrency?

From the introduction above, we know that user seckill traffic is spread evenly across different servers through layer upon layer of load balancing. Even so, the QPS borne by a single machine in the cluster is still very high. How do we push single-machine performance to its limit? To answer that, we need to understand one thing: a booking system normally has to handle three basic stages: generating the order, deducting inventory, and user payment. What we must guarantee is that train tickets are neither oversold nor undersold, that every ticket sold is only valid once paid for, and that the system survives extremely high concurrency. In what order should the three stages be arranged to make the most sense? Let's break it down:

2.1 Place the order, then deduct inventory

When a concurrent user request reaches the server, we first create the order, then deduct inventory and wait for the user to pay. This is the first solution most of us would think of, and it does guarantee that orders are not oversold, because the stock is deducted right after the order is created, as one atomic operation. However, it also causes problems. First, under extreme concurrency the detail of every operation affects performance, and order creation in particular usually has to be persisted to a disk-backed database, so the pressure on the database is easy to imagine. Second, malicious users can place orders without ever paying, which drains inventory and leads to selling fewer tickets than we actually have. The server can limit the number of orders per IP or per user, but that is not a good solution either.

2.2 Pay, then deduct inventory

If we wait for the user to pay before deducting inventory, the first impression is that nothing goes unsold. But this is a big taboo in concurrent architecture: under extreme concurrency, users may create many orders, and once the inventory drops to zero many users will find they cannot pay for the orders they have already grabbed, which is the classic "oversell". Nor does this avoid concurrent disk I/O against the database.

2.3 Withholding inventory

From the two scenarios above we can conclude that as long as an order is created, the database has to be hit frequently. Is there a solution that does not operate on database I/O directly? Yes: withholding inventory. Deduct the inventory first, which guarantees we do not oversell, and then generate the user's order asynchronously, so the response to the user is much faster. But how do we make sure we do not undersell? What if a user grabs an order and never pays? We all know that orders have an expiry time: if the user does not pay within, say, five minutes, the order becomes invalid, and once it does, its ticket is returned to stock. Orders are generated asynchronously and typically processed through real-time consumer queues such as MQ or Kafka; when order volume is small, orders are generated very quickly and users hardly ever have to queue.
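To make the idea concrete, here is a minimal Go sketch of "withhold inventory first, create the order asynchronously". The names (stock, orderQueue, buyTicket) are illustrative only and not from the article's repository; a buffered channel stands in for the MQ/Kafka queue:

package main

import (
    "fmt"
    "sync/atomic"
    "time"
)

// stock is the withheld inventory; orderQueue stands in for an MQ/Kafka topic.
var stock int64 = 10000

type order struct {
    userID    int64
    createdAt time.Time
}

var orderQueue = make(chan order, 1024)

// buyTicket deducts inventory first and only then hands the order off
// to the asynchronous queue, so the user gets an answer immediately.
func buyTicket(userID int64) bool {
    if atomic.AddInt64(&stock, -1) < 0 {
        atomic.AddInt64(&stock, 1) // went below zero: give the ticket back
        return false
    }
    orderQueue <- order{userID: userID, createdAt: time.Now()}
    return true
}

// orderWorker consumes the queue and persists orders. In a real system an
// unpaid order would expire after e.g. five minutes and return its ticket
// to stock; that part is omitted here.
func orderWorker() {
    for o := range orderQueue {
        fmt.Printf("create order for user %d at %v\n", o.userID, o.createdAt)
    }
}

func main() {
    go orderWorker()
    fmt.Println(buyTicket(42)) // true: ticket withheld, order created asynchronously
    time.Sleep(100 * time.Millisecond)
}

If the consumer also tracked an expiry per order and pushed expired, unpaid tickets back into stock, we would get the full behaviour described above.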

3. The art of withholding inventory

From the above analysis it is obvious that withholding inventory is the most reasonable plan. Let's analyze the details further, since there is still a lot of room for optimization: where should the inventory live? How do we guarantee that inventory is deducted correctly under high concurrency while still responding to user requests quickly?

On a single machine with low concurrency, we would usually implement inventory withholding like this:

To guarantee that deducting inventory and generating the order are atomic, we need a database transaction: check the stock, deduct it, create the order, and finally commit the transaction. The whole flow involves a lot of I/O and blocks on the database, so this approach is simply not suitable for a highly concurrent seckill system.
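For reference, a minimal sketch of this transactional flow with Go's database/sql package could look like the following. The table and column names (tickets, stock, orders) and the MySQL-style placeholders are assumptions for illustration, not the article's actual schema:

package ticket

import (
    "database/sql"
    "errors"
)

// deductWithTx checks stock, deducts one ticket and inserts the order in a
// single transaction so the three steps are atomic. A SQL driver (e.g. the
// MySQL driver) is assumed to be registered elsewhere.
func deductWithTx(db *sql.DB, ticketID, userID int64) error {
    tx, err := db.Begin()
    if err != nil {
        return err
    }
    defer tx.Rollback() // a no-op once Commit has succeeded

    var stock int64
    // Lock the stock row so concurrent requests serialize on it.
    if err := tx.QueryRow("SELECT stock FROM tickets WHERE id = ? FOR UPDATE", ticketID).Scan(&stock); err != nil {
        return err
    }
    if stock <= 0 {
        return errors.New("sold out")
    }
    if _, err := tx.Exec("UPDATE tickets SET stock = stock - 1 WHERE id = ?", ticketID); err != nil {
        return err
    }
    if _, err := tx.Exec("INSERT INTO orders(ticket_id, user_id) VALUES(?, ?)", ticketID, userID); err != nil {
        return err
    }
    return tx.Commit()
}

Even with the row lock doing its job, every request still pays at least one round trip of disk and network I/O, which is exactly why this approach collapses at seckill-level concurrency.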

Next we optimize the single-machine scheme: local inventory withholding. We allocate a certain amount of inventory to each machine, deduct it directly in memory, and then create the order asynchronously as before. The improved stand-alone system looks like this:

This avoids frequent I/O against the database; everything happens in memory, which greatly improves the single machine's ability to withstand concurrency. But a single machine can never robustly carry a million user requests. Although Nginx uses the epoll model to handle network requests and the C10K problem has long been solved in the industry, under Linux everything is a file, network connections included, and a huge number of file descriptors can make the operating system unresponsive in an instant. We mentioned Nginx's weighted balancing strategy above: let's assume the 1,000,000 user requests are balanced evenly across 100 servers, so the concurrency each machine has to handle is much smaller. We then keep a local stock of 100 train tickets on each machine, so the total stock across the 100 servers is still 10,000, which guarantees that stock is not oversold. The cluster architecture we are describing looks like this:

The problem comes with high availability under high concurrency: what if two or three of those 100 servers go down because they cannot handle the concurrent traffic, or for any other reason? Then the tickets on those servers cannot be sold, and we end up underselling. To solve this, we need to manage the total ticket count in one place, which is the fault-tolerance scheme we adopt next: each server not only deducts its local stock but also deducts from a unified remote stock. Thanks to the remote unified deduction, we can also allocate some extra "buffer stock" to each machine, based on its load, in case any machine in the cluster goes down. Let's analyze this against the following architecture diagram:

We use Redis to store the unified inventory because of its very high performance; a single instance is claimed to handle around 100,000 QPS. After the local deduction succeeds, and there is a local ticket left, we then ask Redis to deduct the remote stock. Only when both the local and the remote deduction succeed do we return a "ticket grabbed" message to the user, which effectively guarantees that orders are not oversold. When some machines go down, each machine still holds reserved buffer tickets, so the tickets stranded on the failed machines can be made up for on the other machines, which guards against underselling. In theory, the larger the buffer, the more downed machines the system can tolerate, but an over-large buffer also puts extra pressure on Redis. In fact, the number of requests that reach Redis is bounded by the sum of the local stock and the buffer stock: once the local stock is exhausted, the machine returns "sold out" to the user directly and skips the unified-stock logic, which to some extent also keeps Redis from being overwhelmed by huge numbers of network requests. How large the buffer should be is therefore something the architect must weigh carefully.


4. Code demonstration

The Go language was designed for concurrency from the start. I will use Go to show you the concrete flow of single-machine ticket grabbing.

4.1 Initialization

The init function in a Go package runs before main, so it is where we do the preparatory work. Our system needs the following preparation: initialize the local inventory, initialize the hash key of the remote Redis store that holds the unified inventory, and initialize the Redis connection pool. We also need to initialize an int channel of capacity 1, which serves as a lock; you could also use a read-write mutex directly, or use Redis or other means to avoid resource contention, but the channel is more efficient and follows the Go philosophy: do not communicate by sharing memory; instead, share memory by communicating. The Redis library used is Redigo, and here is the code:

...
package localSpike

// LocalSpike holds the local stock and how many tickets this machine has sold.
type LocalSpike struct {
    LocalInStock     int64
    LocalSalesVolume int64
}
...
package remoteSpike

// RemoteSpikeKeys holds the Redis hash key and its fields for total stock and sold count.
type RemoteSpikeKeys struct {
    SpikeOrderHashKey  string // Redis hash key for this ticket sale
    TotalInventoryKey  string // hash field: total stock
    QuantityOfOrderKey string // hash field: number of tickets already sold
}

// NewPool initializes the Redis connection pool.
func NewPool() *redis.Pool {
    return &redis.Pool{
        MaxIdle:   10000,
        MaxActive: 12000, // max number of connections
        Dial: func() (redis.Conn, error) {
            c, err := redis.Dial("tcp", ":6379")
            if err != nil {
                panic(err.Error())
            }
            return c, err
        },
    }
}
...
func init() {
    localSpike = localSpike2.LocalSpike{
        LocalInStock:     150,
        LocalSalesVolume: 0,
    }
    remoteSpike = remoteSpike2.RemoteSpikeKeys{
        SpikeOrderHashKey:  "ticket_hash_key",
        TotalInventoryKey:  "ticket_total_nums",
        QuantityOfOrderKey: "ticket_sold_nums",
    }
    redisPool = remoteSpike2.NewPool()
    done = make(chan int, 1)
    done <- 1
}

4.2 Local and unified inventory withholding

The local inventory deduction logic is very simple: when a user request arrives, we increment the sales volume, compare it against the local stock, and return a bool:

package localSpike

// LocalDeductionStock deducts local stock and returns whether the ticket was grabbed.
func (spike *LocalSpike) LocalDeductionStock() bool {
    spike.LocalSalesVolume = spike.LocalSalesVolume + 1
    return spike.LocalSalesVolume < spike.LocalInStock
}

Note that the operation on the shared LocalSalesVolume would normally need a lock, but because the local deduction and the unified deduction must together form one atomic operation, the locking is done with a channel at the top level, which will be covered later. The unified inventory operates on Redis. Redis is single-threaded, but we need to fetch data, write data and perform a calculation as one sequence of steps, so we wrap the commands in a Lua script to guarantee the atomicity of the operation:

package remoteSpike
......

const LuaScript = `
    local ticket_key = KEYS[1]
    local ticket_total_key = ARGV[1]
    local ticket_sold_key = ARGV[2]
    local ticket_total_nums = tonumber(redis.call('HGET', ticket_key, ticket_total_key))
    local ticket_sold_nums = tonumber(redis.call('HGET', ticket_key, ticket_sold_key))
    -- Check whether tickets remain, then update the sold count
    if (ticket_total_nums >= ticket_sold_nums) then
        return redis.call('HINCRBY', ticket_key, ticket_sold_key, 1)
    end
    return 0
`

// RemoteDeductionStock deducts the unified stock in Redis and returns whether it succeeded.
func (RemoteSpikeKeys *RemoteSpikeKeys) RemoteDeductionStock(conn redis.Conn) bool {
    lua := redis.NewScript(1, LuaScript)
    result, err := redis.Int(lua.Do(conn, RemoteSpikeKeys.SpikeOrderHashKey, RemoteSpikeKeys.TotalInventoryKey, RemoteSpikeKeys.QuantityOfOrderKey))
    if err != nil {
        return false
    }
    return result != 0
}

We use a hash structure to store the total inventory and the total sales count. When a user request comes in, we check whether the sales volume is still within the stock, update the sold count, and return the corresponding bool. Before starting the service, we need to initialize the Redis stock information:

hmset ticket_hash_key "ticket_total_nums" 10000 "ticket_sold_nums" 0

4.3 Responding to User Information

We start an HTTP service that listens on a port:

package main
...

func main() {
    http.HandleFunc("/buy/ticket", handleReq)
    http.ListenAndServe(":3005", nil)
}

Now that all the initialization work is done, the logic of handleReq is quite clear: check whether the ticket was grabbed and return that information to the user.

// handleReq checks whether the ticket grab succeeded and returns the result to the user.
func handleReq(w http.ResponseWriter, r *http.Request) {
    redisConn := redisPool.Get()
    LogMsg := ""
    <-done
    // Global lock: deduct local stock, then the unified stock in Redis
    if localSpike.LocalDeductionStock() && remoteSpike.RemoteDeductionStock(redisConn) {
        util.RespJson(w, 1, "ticket grabbed successfully", nil)
        LogMsg = LogMsg + "result:1,localSales:" + strconv.FormatInt(localSpike.LocalSalesVolume, 10)
    } else {
        util.RespJson(w, -1, "sold out", nil)
        LogMsg = LogMsg + "result:0,localSales:" + strconv.FormatInt(localSpike.LocalSalesVolume, 10)
    }
    done <- 1

    // Write the result to ./stat.log for later analysis
    writeLog(LogMsg, "./stat.log")
}

func writeLog(msg string, logPath string) {
    fd, _ := os.OpenFile(logPath, os.O_RDWR|os.O_CREATE|os.O_APPEND, 0644)
    defer fd.Close()
    content := strings.Join([]string{msg, "\r\n"}, "")
    buf := []byte(content)
    fd.Write(buf)
}

As mentioned above, we have to consider race conditions when deducting inventory. We use a channel to avoid concurrent reads and writes, which guarantees that requests execute sequentially and efficiently. We write the interface's response information to the ./stat.log file so we can gather statistics during the load test.
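For readers who have not seen the pattern before, here is a tiny self-contained sketch (my own illustration, not from the repository) of how a buffered channel of capacity 1 behaves as a mutex, which is exactly how the done channel above serializes the two deduction steps:

package main

import (
    "fmt"
    "sync"
)

// done is a buffered channel of capacity 1: receiving from it takes the
// single token (acquire), sending the token back releases it.
var done = make(chan int, 1)

var counter int64

func main() {
    done <- 1 // seed the token, as the demo's init function does

    var wg sync.WaitGroup
    for i := 0; i < 1000; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            <-done    // acquire: blocks until the token is available
            counter++ // critical section: updates happen strictly one at a time
            done <- 1 // release
        }()
    }
    wg.Wait()
    fmt.Println(counter) // always 1000, never a lost update
}

Because the channel holds only one token, at most one request can be inside the deduction logic at any moment, which is what makes the unlocked LocalSalesVolume update safe.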

4.4 Single-machine service pressure test

We start the service and load-test it with the ab benchmarking tool:

ab -n 10000 -c 100 http://127.0.0.1:3005/buy/ticket

Below is the load-test report from my low-spec local Mac:

This is ApacheBench, Version 2.3 <$Revision: 1826891 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 127.0.0.1 (be patient)
Completed 1000 requests
Completed 2000 requests
Completed 3000 requests
Completed 4000 requests
Completed 5000 requests
Completed 6000 requests
Completed 7000 requests
Completed 8000 requests
Completed 9000 requests
Completed 10000 requests
Finished 10000 requests

Server Software:
Server Hostname:        127.0.0.1
Server Port:            3005

Document Path:          /buy/ticket
Document Length:        29 bytes

Concurrency Level:      100
Time taken for tests:   2.339 seconds
Complete requests:      10000
Failed requests:        0
Total transferred:      1370000 bytes
HTML transferred:       290000 bytes
Requests per second:    4275.96 [#/sec] (mean)
Time per request:       23.387 [ms] (mean)
Time per request:       0.234 [ms] (mean, across all concurrent requests)
Transfer rate:          572.08 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    8  14.7      6     223
Processing:     2   15  17.6     11     232
Waiting:        1   11  13.5      8     225
Total:          7   23  22.8     18     239

Percentage of the requests served within a certain time (ms)
  50%     18
  66%     24
  75%     26
  80%     28
  90%     33
  95%     39
  98%     45
  99%     54
 100%    239 (longest request)

According to these figures, my single machine handles 4,000+ requests per second. Ordinary production servers are multi-core, so handling 10,000+ requests is no problem. Checking the logs also shows that, throughout the test, requests were served normally, the traffic was even and Redis behaved normally:

//stat.log
...
result:1,localSales:145
result:1,localSales:146
result:1,localSales:147
result:1,localSales:148
result:1,localSales:149
result:1,localSales:150
result:0,localSales:151
result:0,localSales:152
result:0,localSales:153
result:0,localSales:154
result:0,localSales:156
...



5. Review

In general, a seckill system is very complex. Here we have only briefly simulated how to optimize a single machine for high performance and how to avoid single points of failure in the cluster, with strategies that guarantee orders are neither oversold nor undersold. A complete order system would also let users view their order progress: each server would run a scheduled task that periodically syncs outstanding orders and stock information from the unified inventory to show to users, and would release orders that are not paid within the validity period, returning their tickets to stock, and so on.

We have implemented the core logic of high-concurrency ticket grabbing. The system design is quite clever: it deliberately avoids database I/O, keeps the high-concurrency requests against Redis down to a small amount of network I/O, performs almost all calculation in memory, and effectively guarantees no overselling or underselling while tolerating the failure of some machines. Two points in particular are worth learning from and summarizing:

  • Load balancing, divide and conquer. Through load balancing, different traffic is split onto different machines; each machine handles its own requests and pushes its performance to the maximum, so the system as a whole can withstand very high concurrency. It is like a team at work: when everyone pushes their own value to the limit, the team naturally grows into a great one.

  • Use concurrency and asynchrony wisely. Ever since the epoll network model solved the C10K problem, asynchrony has been embraced more and more by server-side developers: work that can be done asynchronously should be done asynchronously, and splitting functions apart can yield unexpected benefits. This is reflected in Nginx, Node.js and Redis, whose epoll-based handling of network requests shows how powerful a single thread can still be. Servers have long since entered the multi-core era, and Go, a language born for concurrency, exploits the advantages of multi-core servers perfectly; many tasks that can be processed concurrently should be solved with concurrency. In short, how to reasonably squeeze the CPU and make it deliver its full value is a direction we always need to keep exploring and learning.
