Chapter 3 Concurrent Processing Capability of the Server

3.1 Throughput
  • Throughput rate: The number of requests handled by the Web server per unit of time
  • Maximum throughput rate: The maximum number of requests a server can handle per unit of time
  • Stress test: simulate a large number of concurrent users continuously sending HTTP requests, measure the total duration of the test, and calculate the throughput rate from it. (Typically the model is simplified by stress-testing a representative set of specific requests.)
3.1.1 Preconditions of the stress test
  • Concurrent users: The total number of users simultaneously sending requests to the server at a given time.
  • The total number of requests
  • Description of the requested resources
3.1.2 Request waiting time
  • Average user wait time: reflects the quality of service for a single user under a given number of concurrent users
  • Average server request processing time: the inverse of the throughput rate
  • The Web server uses multiple execution streams to process requests concurrently (interleaving CPU time slices); this lengthens the wait time of an individual user's request but lowers the average server request processing time, as the example below shows.
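
    Example of the relationship: at a throughput rate of 131.30 requests/s (the ab run in 3.1.3 below), the average server request processing time is its inverse, 1/131.30 s ≈ 7.6 ms, while each of the 10 concurrent users waits about 76 ms per request.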
3.1.3 Running a stress test
  • Stress test software: ab (ApacheBench), LoadRunner, JMeter
  • ab test:

    ab -n 1000 -c 10 http://xxx.com/test.html
    # -n 1000: the total number of requests is 1000
    # -c 10: the number of concurrent users is 10

    Result of the request:

    Concurrency Level:      10
    Time taken for tests:   7.616 seconds
    Complete requests:      1000
    Failed requests:        0
    Non-2xx responses:      1000
    Total transferred:      374000 bytes
    HTML transferred:       178000 bytes
    Requests per second:    131.30 [#/sec] (mean)
    Time per request:       76.160 [ms] (mean)
    Time per request:       7.616 [ms] (mean, across all concurrent requests)
    Transfer rate:          47.96 [Kbytes/sec] received

    Requests per second = Complete requests / Time taken for tests
    Time per request = (Time taken for tests × Concurrency Level) / Complete requests
    Time per request (across all concurrent requests) = Time taken for tests / Complete requests
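
    Plugging the output above into the formulas as a check:

    Requests per second = 1000 / 7.616 s ≈ 131.30
    Time per request = (7.616 s × 10) / 1000 = 76.16 ms
    Time per request (across all concurrent requests) = 7.616 s / 1000 = 7.616 ms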

3.1.4 Continuing the stress test
  • Designing a concurrency strategy means coordinating and fully utilizing CPU computation and I/O operations while the server handles many requests at the same time, so that it can sustain a high throughput rate under a large number of concurrent users.
3.2 CPU Concurrent Computation
  • Process (fork): scheduled by the kernel; the basic entity to which resources are allocated, associated with its data through a process descriptor
  • Child process: copies all data from the parent process into its own address space, inheriting all of the parent's context information
  • Lightweight process (clone): allows sharing of the address space, open files, and other data
  • Thread: pthreads implements multithreading management in user mode; LinuxThreads associates each thread one-to-one with a lightweight process, managed by the kernel's process scheduler (see the sketch below)
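
    As a concrete illustration, a minimal C sketch (an example of mine, not from the text) contrasting fork(), which copies the parent's context into a new address space, with pthread_create(), which starts an execution stream sharing its creator's address space:

    #include <stdio.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <pthread.h>

    /* Thread entry point: runs in the same address space as its creator. */
    static void *worker(void *arg) {
        (void)arg;
        printf("thread:  pid=%d (same process, shared memory)\n", (int)getpid());
        return NULL;
    }

    int main(void) {
        /* fork(): the child receives a copy of the parent's data
           (copy-on-write on Linux) and inherits its context. */
        pid_t pid = fork();
        if (pid == 0) {
            printf("child:   pid=%d, parent=%d\n", (int)getpid(), (int)getppid());
            _exit(0);
        }
        waitpid(pid, NULL, 0);

        /* pthread_create(): with LinuxThreads/NPTL each thread maps to a
           lightweight process scheduled by the kernel. */
        pthread_t tid;
        pthread_create(&tid, NULL, worker, NULL);
        pthread_join(tid, NULL);
        return 0;
    }

    Compile with: gcc demo.c -pthread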
  • Scheduler: Maintains a queue of all runnable processes and a list of all dormant processes and zombie processes
  • Process priority: viewable with top

    PR: the length of the time slice the process scheduler assigns to the process, in clock ticks (Linux normally uses a tick of about 10 ms); NI: the dynamic adjustment value applied to the priority (the "nice" value)
  • System load: the run queue contains at least one process at any given moment, namely the currently running process.

    cat /proc/loadavg
    0.00 0.01 0.00 1/382 4154
    # 1/382: one process in the run queue at this moment, out of 382 processes in total
    # 4154: the ID of the most recently created process
  • Process switch: suspending a running process and resuming a previously suspended one is called a process switch (context switch)
  • Process suspension: the process's data held in the CPU registers is temporarily saved on the kernel-mode stack
  • nmon monitoring: the ContextSwitch field shows context switches per second; for a per-process view, see the sketch below
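
    nmon reports system-wide figures; as a per-process complement, a minimal C sketch (mine, assuming Linux) using the standard getrusage(2) interface:

    #include <stdio.h>
    #include <sys/resource.h>

    int main(void) {
        struct rusage ru;
        /* getrusage(2) reports per-process statistics, including
           voluntary and involuntary context switches. */
        if (getrusage(RUSAGE_SELF, &ru) == 0) {
            printf("voluntary context switches:   %ld\n", ru.ru_nvcsw);
            printf("involuntary context switches: %ld\n", ru.ru_nivcsw);
        }
        return 0;
    }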
  • View the current number of httpd processes:

    ps afx | grep httpd | wc -l
  • Conclusion: for a server to support high concurrency, minimize the number of context switches. The simplest way is to reduce the number of processes and to use threads in combination with other I/O models when designing the concurrency strategy
  • iowait: the proportion of time the CPU is idle while waiting for an I/O operation to complete (visible in nmon's CPU monitoring data)
  • Lock contention: while one task holds a locked resource, the other tasks wait for the lock to be released; minimize the use of shared resources by concurrent requests, as the sketch below illustrates.
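
    A minimal C sketch of the principle (my own example): instead of taking the shared lock on every increment, each thread accumulates privately and touches the shared counter only once, so tasks rarely wait on the lock:

    #include <stdio.h>
    #include <pthread.h>

    static long counter = 0;
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    static void *worker(void *arg) {
        (void)arg;
        long local = 0;                  /* private: no lock needed       */
        for (int i = 0; i < 1000000; i++)
            local++;
        pthread_mutex_lock(&lock);       /* shared resource touched once  */
        counter += local;
        pthread_mutex_unlock(&lock);
        return NULL;
    }

    int main(void) {
        pthread_t t[4];
        for (int i = 0; i < 4; i++)
            pthread_create(&t[i], NULL, worker, NULL);
        for (int i = 0; i < 4; i++)
            pthread_join(t[i], NULL);
        printf("counter = %ld\n", counter);   /* 4000000 */
        return 0;
    }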
3.3 System Calls
  • A system call is a request made by a program running in user space to the operating system kernel for a service that requires higher privileges to perform.
  • User mode and kernel mode are the two privilege levels at which a Linux process runs, and a process can switch between the two modes.
  • On Linux, application code invokes system calls indirectly through the wrapper functions provided by the glibc library, as sketched below.
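
    A small C sketch (mine) showing both paths: the glibc wrapper getpid() versus the generic syscall(2) entry point:

    #define _GNU_SOURCE
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/syscall.h>

    int main(void) {
        pid_t a = getpid();                    /* glibc wrapper   */
        pid_t b = (pid_t)syscall(SYS_getpid);  /* raw system call */
        printf("getpid() = %d, syscall(SYS_getpid) = %d\n", (int)a, (int)b);
        return 0;
    }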
  • A system call involves the process switching from user mode to kernel mode, which incurs a certain amount of memory-space switching overhead, so Web server optimization should cut down on unnecessary system calls; user-space buffering, shown below, is one way to do so.
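
    A sketch of mine that strace would show entering the kernel far less often in its second loop, because stdio buffers in user space:

    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        /* One write(2) system call per character: 26 kernel entries. */
        for (char c = 'a'; c <= 'z'; c++)
            write(STDOUT_FILENO, &c, 1);

        /* stdio buffers in user space and flushes the same amount of
           output with only a few write(2) calls. */
        for (char c = 'A'; c <= 'Z'; c++)
            fputc(c, stdout);
        return 0;
    }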
  • Use strace to trace the system calls of a child process (e.g., strace -p <pid> attaches to a running process; -f follows its children)
3.4 Memory Allocation
  • Frequent memory allocation and release forces the allocator to spend time consolidating memory over a given period, which hurts performance
  • Apache uses a multi-process model and manages memory with a memory pool: at startup it requests one large block of memory at once to serve as the pool (see the sketch at the end of this section)
  • Nginx uses a single-process model and handles requests with multiple threads, which lets the threads share memory resources with each other
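
    To illustrate the memory-pool idea, a minimal bump-pointer pool in C (my own sketch; pool_t, pool_alloc, and the other names are hypothetical, not Apache's actual APR API): one large malloc() up front, cheap sub-allocations per request, and a single free() for everything:

    #include <stdio.h>
    #include <stdlib.h>

    typedef struct {
        char  *base;   /* start of the large block */
        size_t size;   /* total pool capacity      */
        size_t used;   /* bytes handed out so far  */
    } pool_t;

    static pool_t *pool_create(size_t size) {
        pool_t *p = malloc(sizeof(*p));
        p->base = malloc(size);          /* the one big allocation */
        p->size = size;
        p->used = 0;
        return p;
    }

    static void *pool_alloc(pool_t *p, size_t n) {
        if (p->used + n > p->size)
            return NULL;                 /* pool exhausted */
        void *ptr = p->base + p->used;
        p->used += n;
        return ptr;
    }

    static void pool_destroy(pool_t *p) {
        free(p->base);                   /* release everything at once */
        free(p);
    }

    int main(void) {
        pool_t *p = pool_create(4096);
        char *buf = pool_alloc(p, 128);  /* no malloc() per request */
        snprintf(buf, 128, "allocated from the pool");
        printf("%s\n", buf);
        pool_destroy(p);
        return 0;
    }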