Last time we deconstructed Tomcat's architecture from a god's-eye view and covered the overall component design. Now it's time to get down to the nitty-gritty of how each component is implemented. Moving from far to near: the architecture gives you the macro picture, and the details reveal the beauty in full. Follow "Code Brother Byte" for more hardcore content. Ready?

Last time, "Code Brother Byte" took Tomcat's architecture apart from a god's-eye view, analyzed how Tomcat starts and stops, and showed how the connector and container designs work together to accept and respond to a request. Connectors handle external communication and socket connections; containers handle internal work, loading servlets and processing the actual requests and responses. For the details, see the portal to the previous article: Tomcat Architecture Analysis.

Core preparation for the high-concurrency deep dive

This time we break things down further, focusing on Tomcat's high-concurrency design and performance tuning, so that we reach a deeper understanding of the whole architecture. Along the way we'll see object-oriented, interface-first design in every component: how to encapsulate what changes and what stays the same, how to abstract different components according to actual needs, how to design classes with a single responsibility, and how to keep similar functionality highly cohesive and loosely coupled. The design patterns applied here are well worth learning from.

This time I’ll cover the I/O model and the basics of thread pools.

Before diving in, I hope you have accumulated the following background; much of it has already been covered in "Code Brother Byte"'s earlier articles, which you can scroll back through to review. With these knowledge points in hand, taking Tomcat apart will give you twice the result for half the effort; without them it is easy to get lost.

Let's take a look at how Tomcat handles concurrent connections and tasks. Performance optimization means every component plays its part: using the least memory and executing as fast as possible is the goal.

Design patterns

Template method pattern: an abstract class defines the skeleton of an algorithm, encapsulating both what changes and what stays the same in the process. The variable steps are deferred to subclasses, which gives code reuse and follows the open-closed principle.

Observer pattern: when different components need to react differently to the same event, this pattern decouples the publisher from its downstream listeners and notifies them flexibly.

Chain of responsibility pattern: Objects are connected into a chain along which requests are passed. Valve in Tomcat is an application of this design pattern.

More design patterns can be found in "Code Brother Byte"'s earlier design-patterns album; the portal is there.

I/O model

You'll need to understand synchronous blocking, synchronous non-blocking, I/O multiplexing, and asynchronous non-blocking I/O, along with the Java NIO package. This article also highlights how Tomcat uses I/O to achieve high concurrency; by the end I believe you will have a solid grasp of the I/O models.

Java concurrent programming

Achieving high concurrency takes more than elegant component design, appropriate design patterns, and good use of I/O; it also needs a sound threading model and efficient concurrent-programming techniques. Under high concurrency, multiple threads inevitably access shared variables, which requires locking. How do we reduce lock contention? As programmers we should make a conscious effort to avoid locks where possible, for example by using atomic (CAS-based) classes or concurrent collections instead. When a lock is unavoidable, minimize both its scope and its granularity.
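As a minimal illustration (not Tomcat code; class and variable names are invented), here is a sketch of replacing a lock with a CAS-based atomic class for a shared counter:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CasCounterDemo {
    public static void main(String[] args) throws InterruptedException {
        AtomicInteger hits = new AtomicInteger();
        Runnable task = () -> {
            for (int i = 0; i < 10_000; i++) {
                hits.incrementAndGet(); // lock-free CAS update, no synchronized block needed
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(hits.get()); // 20000: no updates lost, no lock contention
    }
}
```

The same counter guarded by a synchronized block would work too, but the atomic class avoids blocking threads under contention.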

If readers are interested in the underlying concurrency basics, "Code Brother Byte" will keep covering them; several concurrency articles are already available in the history or album (portal there), explaining how concurrency is implemented, memory visibility, the JMM memory model, read-write locks, and more.

Overall Architecture of Tomcat

To review Tomcat's overall architecture: the Connector is designed to handle TCP/IP connections, and the Container acts as the Servlet container, processing the actual business requests. The two are abstracted separately, one facing outward and one inward, which makes each easy to extend.

  • A Tomcat instance has one Service by default, and a Service can contain multiple connectors. A connector consists of a ProtocolHandler and an Adapter, which together perform the connector's core functions.
  • The ProtocolHandler consists mainly of the Acceptor and the SocketProcessor, which read the TCP/IP-layer Socket and convert the bytes into a TomcatRequest and TomcatResponse; the appropriate Processor is then chosen according to the HTTP or AJP protocol. The Adapter converts TomcatRequest and TomcatResponse into the standard ServletRequest and ServletResponse, and getAdapter().service(request, response) passes the request on to the Container.
  • The adapter.service() implementation in org.apache.catalina.connector.CoyoteAdapter forwards the request to the container:
// Calling the container
connector.getService().getContainer().getPipeline()
        .getFirst().invoke(request, response);

This call kicks off the chain-of-responsibility pattern: getPipeline() returns the Pipeline that moves the request through the container. Each container has a Pipeline, starting at First and ending at Basic, where the request passes into the child container held inside, and so on until it finally reaches the Servlet; this is a classic use of the chain of responsibility. Concretely, each Pipeline forms a request chain whose links are Valves. "Code Brother Byte" explained this in detail in the previous article on Tomcat architecture. As the figure below shows, the important components of Tomcat's architecture are clearly visible. Keep this global architecture diagram in mind; mastering the overall idea makes it much easier to appreciate the beauty of the details.
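To make the Valve chain concrete, here is a stripped-down sketch; the Valve and Pipeline names are modeled loosely on Tomcat's, but the code is illustrative only:

```java
// Simplified Valve chain in the spirit of Tomcat's Pipeline: each valve
// handles the request and passes it on; the basic valve ends the chain.
public class PipelineDemo {
    public static void main(String[] args) {
        Valve first = new LoggingValve();
        first.setNext(new BasicValve());
        StringBuilder req = new StringBuilder();
        first.invoke(req);
        System.out.println(req); // logged->servlet
    }
}

interface Valve {
    void setNext(Valve v);
    void invoke(StringBuilder request);
}

abstract class ValveBase implements Valve {
    protected Valve next;
    public void setNext(Valve v) { next = v; }
}

class LoggingValve extends ValveBase {
    public void invoke(StringBuilder request) {
        request.append("logged->");
        next.invoke(request); // pass the request down the chain
    }
}

class BasicValve extends ValveBase {
    public void invoke(StringBuilder request) {
        // The chain ends in the basic valve, which in Tomcat would
        // hand the request off to the child container's pipeline.
        request.append("servlet");
    }
}
```

In Tomcat the Basic Valve of each container forwards into the child container's pipeline, which is how the request descends from Engine to Host to Context to Wrapper.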

Startup process: What happened to the startup.sh script

  • Tomcat is a Java application, so the startup.sh script starts a JVM to run Tomcat’s Bootstrap class.
  • Bootstrap instantiates Catalina and initializes Tomcat's custom class loaders. Hot loading and hot deployment depend on them.
  • Catalina: parses server.xml, creates the Server component, and calls the Server's start() method.
  • Server: manages the Service components, calling their start() methods.
  • Service: its main responsibility is managing the connectors and the top-level container Engine, calling the start() methods of the Connector and the Engine respectively.

The Engine container primarily uses the composite pattern to associate containers in parent-child relationships, and Container inherits Lifecycle so that each container can be initialized and started. Lifecycle defines init(), start(), and stop() to control the lifecycle of container components with one click.

The lifecycle is centrally managed by the LifecycleBase abstract class. LifecycleBase uses the template method design pattern to separate what changes from what stays the same across components, deferring the initialization of each component to its subclass. It also uses the observer pattern to publish startup events, decoupling publishers from listeners.
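A minimal sketch of the template method idea behind LifecycleBase; the class names here (SimpleLifecycle, DemoServer) are invented for illustration:

```java
// Template method sketch: the final init() fixes the invariant steps
// (state transitions, event firing) and delegates the variable step,
// initInternal(), to each component subclass.
public class TemplateDemo {
    public static void main(String[] args) {
        DemoServer s = new DemoServer();
        s.init();
        System.out.println(s.getState());
    }
}

abstract class SimpleLifecycle {
    private String state = "NEW";

    public final void init() {          // invariant skeleton: not overridable
        state = "INITIALIZING";
        initInternal();                 // variable point deferred to the subclass
        state = "INITIALIZED";
        fireEvent("after_init");        // observer-style notification hook
    }

    protected abstract void initInternal();

    protected void fireEvent(String event) {
        System.out.println(getClass().getSimpleName() + ":" + event);
    }

    public String getState() { return state; }
}

class DemoServer extends SimpleLifecycle {
    protected void initInternal() { System.out.println("creating services"); }
}
```

Because init() is final, no subclass can break the state machine; subclasses only fill in their own initInternal().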

The specific init and start processes are shown in the swimlane diagrams below; these are notes I made while debugging through the source code. Don't be afraid of taking a long time over notes: follow the debugger slowly and record as you go, and I believe you will come away with a much deeper understanding.

The init process

Start the process

Armed with these two diagrams, grab the main-line components and debug through them, reading the source code alongside the swimlane diagrams; I believe you will be rewarded with twice the result for half the effort. While reading the source, don't dive into any one detail first. Abstract each component and understand its responsibilities; only after grasping the responsibilities and design philosophy of each component should you study its implementation details. Never try to understand a specific leaf before the forest.

I have marked each core class in the architecture diagram and the swimlane diagrams. Next, "Code Brother Byte" will share how to read source code efficiently while keeping your interest alive.

How to read the source code correctly

Don't get bogged down in details and lose sight of the big picture: if I stare at the leaves before I know what the forest looks like, I lose the overall design. So when reading source code to learn, don't start with the details; first take a macro view of the overall architecture and how the modules relate to each other.

1. Before reading the source code, you need to have a certain technical reserve

Common design patterns, for example, must be mastered, especially: template method, strategy, singleton, factory, observer, dynamic proxy, adapter, chain of responsibility, and decorator. You can review the design-patterns history in "Code Brother Byte" to build a solid foundation.

2. Be able to use the framework/library proficiently before reading it

The devil is in the details. If you don't know how the framework is used, you may be able to see what the code does, but not why it is written that way.

3. First look for books and materials to understand the overall design of the software.

Look at it from a global, god's-eye perspective to pick out the core architectural design: forest before leaves. What modules are there? How do the modules relate to and cooperate with each other?

It may not all sink in at once, but build an overall concept, like a map, so you don't get lost.

While reading the source code, check where you are on that map every now and then. For example, "Code Brother Byte" has sorted out Tomcat's architecture for you; follow along and debug it yourself, which is very efficient.

4. Build the system and run the source code!

Debugging is a very, very important tool. You can't understand a system just by reading it without running it. Use the call stack wisely to observe the context of each call.

5. Note

One very important job is note-taking (yes, write it down!). Draw a class diagram of the system (don't rely on the IDE to generate it for you) and record the main function calls for later review.

Documentation is extremely important because the code is too complex and the human brain has limited capacity to remember all the details. Documentation will help you remember key points so you can recall them and move on quickly.

Otherwise, what you read today may be almost forgotten by tomorrow. So remember to bookmark this and revisit it, and try to download the source code and debug it repeatedly.

Wrong way

  • Getting caught up in details and not seeing the big picture: staring at the leaves before knowing what the forest looks like means losing the overall design. When reading source code to learn, don't start with the details; take a macro view of the overall architecture and module relationships first.
  • Studying how it is designed before learning how to use it: most frameworks use design patterns, so you should at least understand the common ones, even if only by rote, so you have a clear picture in mind. When learning a technology, I recommend reading the official documentation first to see its modules and overall design ideas, then downloading a sample and running it through, and only then looking at the source code.
  • Memorizing source-code details: even when reading a specific module's source, consciously avoid sinking into the details. What matters is learning the design ideas, not the logic of one particular method, unless you plan to do secondary development on the source code, and even then the details build on an overall understanding of the framework.

Component design – Implementation of a single responsibility, interface oriented philosophy

When we receive a functional requirement, the most important thing is to abstract the design: break out the core components of the feature, then find what changes and what stays the same in the requirement. Similar functionality should converge into cohesive components that are open to external extension and closed to internal modification. Meeting a requirement calls for reasonable abstraction into distinct components, rather than lumping all the functionality into a single class or even a single method, where the code becomes entangled, unscalable, and hard to maintain and read.

With that in mind, let's look at how Tomcat designs components to manage connectors and containers.

Let's see how Tomcat starts itself, then how it accepts requests and forwards them to our Servlets.

Catalina

Catalina's main task is creating the Server, and not just creating it: it parses the server.xml file to create the components the file configures, then calls the Server's init() and start() methods, and the startup journey begins from here. It also takes care of exceptions: for example, closing Tomcat gracefully requires releasing the resources created during startup, so Catalina registers a "shutdown hook" with the JVM. I have annotated the source code below and omitted some irrelevant parts. Tomcat also listens for the stop command via the await() method.

    /** Start a new server instance. */
    public void start() {
        // If server is null, parse server.xml to create it
        if (getServer() == null) {
            load();
        }
        // If creation failed, report the error and exit startup
        if (getServer() == null) {
            log.fatal("Cannot start server. Server instance is not configured.");
            return;
        }

        // Start the server
        try {
            getServer().start();
        } catch (LifecycleException e) {
            log.fatal(sm.getString("catalina.serverStartFail"), e);
            try {
                // On failure, call destroy to release the resources
                getServer().destroy();
            } catch (LifecycleException e1) {
                log.debug("destroy() failed for failed Server ", e1);
            }
            return;
        }

        // Create and register the JVM shutdown hook
        if (useShutdownHook) {
            if (shutdownHook == null) {
                shutdownHook = new CatalinaShutdownHook();
            }
            Runtime.getRuntime().addShutdownHook(shutdownHook);
        }

        // Listen for the stop request with the await method
        if (await) {
            await();
            stop();
        }
    }

Through the "shutdown hook" you can do cleanup when the JVM shuts down, such as releasing thread pools, cleaning up temporary files, and flushing in-memory data to disk.

A "shutdown hook" is essentially a thread that the JVM tries to execute before it stops. Let's see what CatalinaShutdownHook does.

    /** Shutdown hook which will perform a clean shutdown of Catalina if needed. */
    protected class CatalinaShutdownHook extends Thread {

        @Override
        public void run() {
            try {
                if (getServer() != null) {
                    Catalina.this.stop();
                }
            } catch (Throwable ex) {
                ...
            }
        }
    }

    /** Close the created Server instance. */
    public void stop() {

        try {
            // Remove the ShutdownHook first so that server.stop()
            // doesn't get invoked twice
            if (useShutdownHook) {
                Runtime.getRuntime().removeShutdownHook(shutdownHook);
            }
        } catch (Throwable t) {
            ......
        }

        // Stop the Server
        try {
            Server s = getServer();
            LifecycleState state = s.getState();
            // If it is already stopping or stopped, do nothing
            if (LifecycleState.STOPPING_PREP.compareTo(state) <= 0
                    && LifecycleState.DESTROYED.compareTo(state) >= 0) {
                // Nothing to do. stop() was already called
            } else {
                s.stop();
                s.destroy();
            }
        } catch (LifecycleException e) {
            log.error("Catalina.stop", e);
        }
    }

The stop method of the Server releases and cleans up all resources.
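The shutdown-hook mechanism itself is plain JDK functionality; here is a minimal standalone sketch (class name and messages are invented) showing a hook releasing a resource before the JVM exits:

```java
// Minimal shutdown-hook sketch: like CatalinaShutdownHook, the hook is just
// a Thread that the JVM runs before exiting, used here for cleanup work.
public class HookDemo {
    public static void main(String[] args) {
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            // Runs on normal exit, System.exit(), or SIGTERM (not on kill -9)
            System.out.println("cleanup: resources released");
        }));
        System.out.println("server running");
        // main ends normally; the JVM runs the hook before it exits
    }
}
```

Note that hooks do not run if the JVM is killed forcibly (e.g. SIGKILL), so they complement, not replace, explicit stop() calls.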

The Server component

To experience the beauty of interface design, look at how Tomcat designs components and interfaces. The Server component is abstracted as an interface, and since the Server needs lifecycle management, it inherits Lifecycle to get one-click start and stop.

Lifecycle covers component initialization, start, stop, destroy, and listener management. It is really an application of the observer pattern: when different events are triggered, they are published to the listeners, which carry out their own processing. That is decoupling by design.

The Server component is responsible for managing the Service components.

Next, let's look at the Server component's implementation class, StandardServer: what does it do, and which classes is it associated with?

While reading source code, always pay extra attention to interfaces and abstract classes: an interface is the global abstraction of a component's design, and an abstract class is usually an application of the template method pattern, whose main purpose is to abstract the overall algorithm flow, defer the variable points to subclasses, and reuse the code for the invariant points.

StandardServer inherits LifecycleBase, so its lifecycle is centrally managed. Its child components are Services, so it also manages the Service lifecycle: it calls each Service's start method on startup and their stop methods on shutdown. The Server keeps its Service components in an array. How does it add a Service to that array?

    /**
     * Add a Service to the array.
     *
     * @param service The Service to be added
     */
    @Override
    public void addService(Service service) {

        service.setServer(this);

        synchronized (servicesLock) {
            // Create a results array of length services.length + 1
            Service results[] = new Service[services.length + 1];
            // Copy the old data into the results array
            System.arraycopy(services, 0, results, 0, services.length);
            results[services.length] = service;
            services = results;

            // Start the Service component
            if (getState().isAvailable()) {
                try {
                    service.start();
                } catch (LifecycleException e) {
                    // Ignore
                }
            }

            // Observer pattern: trigger the listener event
            support.firePropertyChange("service", null, service);
        }
    }

As the code shows, the Server does not allocate a long array up front; instead it grows the array dynamically as Services are added, which saves memory. This is worth imitating in our own development: don't waste memory on space you may never use.

Another important point: the last line of the Catalina start method above calls the Server's await method.

The await method creates a Socket listening on port 8005 and receives connection requests in an infinite loop. When a new connection arrives, it accepts it and reads data from the Socket; if the data read is the stop command "SHUTDOWN", it exits the loop and enters the shutdown process.
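A simplified sketch of the await() idea, using an ephemeral port instead of 8005; the class name and in-process "admin client" thread are invented for the demo:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Sketch of the await() idea: block on a ServerSocket until a client
// sends the literal command "SHUTDOWN", then fall through to the stop path.
public class AwaitDemo {
    public static void main(String[] args) throws Exception {
        ServerSocket server = new ServerSocket(0);   // 0 = any free port; Tomcat uses 8005
        int port = server.getLocalPort();

        // Simulate an admin client sending the stop command.
        Thread client = new Thread(() -> {
            try (Socket s = new Socket("localhost", port)) {
                OutputStream out = s.getOutputStream();
                out.write("SHUTDOWN\n".getBytes(StandardCharsets.UTF_8));
                out.flush();
            } catch (Exception e) { throw new RuntimeException(e); }
        });
        client.start();

        // The "await" loop: accept connections until the stop command arrives.
        while (true) {
            try (Socket conn = server.accept();
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
                String command = in.readLine();
                if ("SHUTDOWN".equals(command)) {
                    System.out.println("stop command received, shutting down");
                    break;
                }
            }
        }
        server.close();
        client.join();
    }
}
```

This is why `telnet localhost 8005` followed by typing SHUTDOWN stops a default Tomcat install.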

Service

True to interface-oriented design, the Service component's implementation class is StandardService. The Service component again inherits Lifecycle for lifecycle management; I won't repeat the class diagram. Let's first look at the main methods and member variables the Service interface defines. From the interface we can learn the core functions; when reading source code, always study the relationships between interfaces first, and don't rush into the implementation classes.

public interface Service extends Lifecycle {

    // --------------------------------------------------------- Properties

    // The top-level container held by the Service component: the Engine
    public Engine getContainer();

    // Set the Engine container for the Service
    public void setContainer(Engine engine);

    // The Server component to which the Service belongs
    public Server getServer();

    // --------------------------------------------------------- Public Methods

    // Add a connector associated with the Service
    public void addConnector(Connector connector);

    public Connector[] findConnectors();

    // Custom thread pool
    public void addExecutor(Executor ex);

    // The Mapper is used to locate the component that handles a request
    Mapper getMapper();
}

Let’s take a closer look at the Service implementation class:

public class StandardService extends LifecycleBase implements Service {
    // Name
    private String name = null;

    // Server instance
    private Server server = null;

    // Array of connectors
    protected Connector connectors[] = new Connector[0];
    private final Object connectorsLock = new Object();

    // The corresponding Engine container
    private Engine engine = null;

    // The mapper and its listener: another application of the observer pattern
    protected final Mapper mapper = new Mapper();
    protected final MapperListener mapperListener = new MapperListener(this);
}

StandardService inherits the LifecycleBase abstract class, whose final template methods define the lifecycle: each pins down the invariant flow and exposes the variable points as abstract methods so that different components can customize their own steps. This is where we learn to use the template method to separate change from invariance.

There are also familiar components in StandardService, such as Server, Connector, Engine, and Mapper.

So why is there a MapperListener? Because Tomcat supports hot deployment: when a deployed Web application changes, the mapping information in the Mapper must change too. The MapperListener is a listener that watches for container changes and updates the Mapper accordingly. One event can be handled differently by multiple downstream listeners; the publisher does not need to call every downstream service itself, but decouples from them through the observer pattern.
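The firePropertyChange call seen in addService above comes from java.beans.PropertyChangeSupport; here is a standalone sketch of that observer wiring (the demo class, listener names, and event values are invented):

```java
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;

// Observer sketch using java.beans.PropertyChangeSupport, the same utility
// StandardServer uses in addService(): the publisher fires one event and
// every registered listener reacts, without the publisher knowing about them.
public class ObserverDemo {
    public static void main(String[] args) {
        PropertyChangeSupport support = new PropertyChangeSupport(new Object());

        // A MapperListener-style observer: react to container changes.
        PropertyChangeListener mapperListener =
                evt -> System.out.println("mapper updated for: " + evt.getNewValue());
        // A second, independent observer on the same event.
        PropertyChangeListener logListener =
                evt -> System.out.println("log: " + evt.getPropertyName() + " added");

        support.addPropertyChangeListener(mapperListener);
        support.addPropertyChangeListener(logListener);

        // The publisher fires once; both observers are notified.
        support.firePropertyChange("service", null, "StandardService[demo]");
    }
}
```

Adding a third listener later requires no change to the publisher, which is the decoupling the article describes.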

The Service manages the connectors as well as the top-level Engine container, so let's continue into its startInternal method, one of the abstract hooks defined by the LifecycleBase template, and see the order in which it starts each component.

protected void startInternal() throws LifecycleException {

    // 1. Trigger the start listeners
    setState(LifecycleState.STARTING);

    // 2. Start the Engine first; the Engine starts its child containers.
    //    Because the composite pattern is used, each layer starts its children.
    if (engine != null) {
        synchronized (engine) {
            engine.start();
        }
    }

    // 3. Then start the Mapper listener
    mapperListener.start();

    // 4. Finally start the connectors, which start their child components, such as the Endpoint
    synchronized (connectorsLock) {
        for (Connector connector : connectors) {
            if (connector.getState() != LifecycleState.FAILED) {
                connector.start();
            }
        }
    }
}

The Service starts the Engine component, then the Mapper listener, and finally the connectors. That order makes sense: the inner components must be started before anything can be served, and only then can the outward-facing connector components start. The Mapper also depends on the container components, listening for their changes, so the Mapper and MapperListener start after them. Components stop in the reverse of the start order, again based on these dependencies.

Engine

As the top component of the Container hierarchy, Engine is essentially a container. It inherits ContainerBase, which is again an abstract class built on the template method pattern. ContainerBase holds each component's child containers in a HashMap<String, Container> children = new HashMap<>() member variable, and protected final Pipeline pipeline = new StandardPipeline(this) holds the Pipeline that processes requests from the connector; the chain of responsibility pattern builds that Pipeline.

public class StandardEngine extends ContainerBase implements Engine {
}

Engine’s child container is Host, so children holds Host.

Let’s see what ContainerBase does…

  • initInternal: defines container initialization and creates a thread pool dedicated to starting and stopping containers.
  • startInternal: the default container-start implementation. Using the composite pattern to build parent-child relationships, it first gets its own child containers and starts them with the startStopExecutor.
public abstract class ContainerBase extends LifecycleMBeanBase
        implements Container {

    // Provides the default initialization logic
    @Override
    protected void initInternal() throws LifecycleException {
        BlockingQueue<Runnable> startStopQueue = new LinkedBlockingQueue<>();
        // Create a thread pool used to start or stop containers
        startStopExecutor = new ThreadPoolExecutor(
                getStartStopThreadsInternal(),
                getStartStopThreadsInternal(), 10, TimeUnit.SECONDS,
                startStopQueue,
                new StartStopThreadFactory(getName() + "-startStop-"));
        startStopExecutor.allowCoreThreadTimeOut(true);
        super.initInternal();
    }

    // Container start
    @Override
    protected synchronized void startInternal() throws LifecycleException {

        // Get the child containers and submit them to the thread pool to start
        Container children[] = findChildren();
        List<Future<Void>> results = new ArrayList<>();
        for (Container child : children) {
            results.add(startStopExecutor.submit(new StartChild(child)));
        }
        MultiThrowable multiThrowable = null;
        // Collect the startup results
        for (Future<Void> result : results) {
            try {
                result.get();
            } catch (Throwable e) {
                log.error(sm.getString("containerBase.threadedStartFailed"), e);
                if (multiThrowable == null) {
                    multiThrowable = new MultiThrowable();
                }
                multiThrowable.add(e);
            }
        }
        ...
        // Start the pipeline, which processes the requests passed in by the connector
        if (pipeline instanceof Lifecycle) {
            ((Lifecycle) pipeline).start();
        }

        // Publish the start event
        setState(LifecycleState.STARTING);

        // Start our background thread
        threadStart();
    }
}

ContainerBase inherits LifecycleMBeanBase, so it also gets lifecycle management; it provides the default start logic for child containers, along with CRUD operations on them.

The Engine uses ContainerBase's startInternal method to start the Host containers. What else does Engine do by itself?

Let's look at the constructor: it sets the Pipeline's basic valve by creating a StandardEngineValve.

    /**
     * Create a new StandardEngine component with the default basic Valve.
     */
    public StandardEngine() {

        super();
        pipeline.setBasic(new StandardEngineValve());
        ...
    }

A container's main job is to process requests and forward them to a child Host container, and that work is implemented by Valves. Each container component has a Pipeline that forms a chain of responsibility along which requests travel, and each Pipeline has a Basic Valve. Engine's Basic Valve is defined as follows:

final class StandardEngineValve extends ValveBase {
    @Override
    public final void invoke(Request request, Response response)
        throws IOException, ServletException {

        // Select the appropriate Host to process the request;
        // the Mapper component has already chosen it
        Host host = request.getHost();
        if (host == null) {
            response.sendError
                (HttpServletResponse.SC_BAD_REQUEST,
                 sm.getString("standardEngine.noHost",
                              request.getServerName()));
            return;
        }
        if (request.isAsyncSupported()) {
            request.setAsyncSupported(host.getPipeline().isAsyncSupported());
        }

        // Get the first Valve of the Host's Pipeline and forward the request to the Host
        host.getPipeline().getFirst().invoke(request, response);
    }
}

The basic valve implementation is straightforward: it simply forwards the request to the Host container, which it retrieves from the request object. How can the request object already hold a Host container? Because before the request reaches the Engine, the Mapper component has already routed it: the Mapper locates the appropriate container by the request URL and stores that container object in the request object.

Component Design Summary

Have you noticed that Tomcat's design is almost entirely interface-oriented? Isolating functions behind interfaces is exactly the single-responsibility principle in action: each interface abstracts a different component, and abstract classes define the components' shared execution flow. Along the way we have seen the observer pattern, the template method pattern, the composite pattern, the chain of responsibility pattern, and the philosophy of abstracting components through interface design.

I/O model and thread pool design for connectors

The connector accepts TCP/IP connections, limits the number of connections, reads data, and forwards requests to the container, so I/O programming is unavoidable here. Next, let's analyze how Tomcat uses the I/O models to achieve high concurrency, and step into the world of I/O together.

There are five main I/O models: synchronous blocking, synchronous non-blocking, I/O multiplexing, signal-driven, and asynchronous I/O. Do they sound familiar yet blur together when you try to tell them apart?

I/O is the process of copying data between the computer’s memory and external devices.

The CPU reads data from an external device into memory and then processes it. Consider this scenario: when a program issues a read instruction to an external device, it takes some time for the data to be copied from the device into memory. During that time, does the thread yield the CPU to other work, or does it keep polling: has the data arrived yet? Has the data arrived yet?

That's what the I/O models are all about. I'll start with the differences between the various I/O models, and then focus on how Tomcat's NioEndpoint component implements a non-blocking I/O model.

I/O model

A network I/O operation, such as a network read, involves two parties: the user thread that issues the I/O call, and the operating system kernel. A process's address space is divided into user space and kernel space, and user threads cannot access kernel space directly.

There are two main steps in network reading:

  • The user thread waits for the kernel to copy data from the NIC into kernel space.
  • The kernel copies data from kernel space to user space.

Sending data to the network is the mirror image: data is copied from user space into kernel space, and the kernel then copies it to the NIC for sending.

The differences between the I/O models come down to how these two steps are carried out:

  • Synchronous vs. asynchronous: who completes the copy from kernel space to user space. In a synchronous model, the user thread itself waits for (or keeps checking on) that copy; in an asynchronous model, the kernel completes the copy and then notifies the user thread.
  • Blocking vs. non-blocking: whether a call returns immediately when the data is not yet ready, or suspends the calling thread until it is.

Synchronous blocking I/O

When the user thread issues a read call, it blocks and gives up the CPU. The kernel waits for the NIC data to arrive and copies it from the NIC into kernel space, then copies it from kernel space to user space and wakes the reading thread. The user thread is blocked through both steps.

Synchronous non-blocking I/O

The user thread calls read repeatedly; each call returns a failure immediately if the data has not yet reached kernel space, so the polling loop itself does not block. Once the data reaches kernel space, the thread blocks while the kernel copies it from kernel space to user space, and is woken when the data arrives in user space.

I/O multiplexing

The user thread’s read operation is divided into two steps:

  1. The user thread first issues a select call, essentially asking the kernel: is any data ready yet? When the kernel has data ready, step 2 runs.
  2. The user thread then issues a read call and blocks while the kernel copies the data from kernel space to user space.

A single select call can query the readiness state of multiple data channels from the kernel, which is why this is called multiplexing.

Asynchronous I/O

The user thread registers a callback function when it makes a read call. The read call returns immediately without blocking, and waits for the kernel to prepare the data before calling the newly registered callback function to process the data. The user thread does not block the entire time.

Tomcat NioEndpoint

Tomcat’s NioEndpoint component implements the I/O multiplexing model, which is where its concurrency capability comes from. Let’s take a peek at how NioEndpoint is designed.

For using a Java multiplexer, there are essentially two steps:

  1. Create a Selector, register the events of interest with it, then call the select method and wait for one of those events to happen.

  2. When an event of interest happens, such as a channel becoming readable, have a thread read the data from the Channel.
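These two steps can be sketched with a minimal example, using an in-memory Pipe as a stand-in for a network channel (the class and method names below are illustrative, not Tomcat's):

```java
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

public class SelectDemo {
    // Step 1: register a channel with a Selector and call select();
    // step 2: select() returns once the channel has data to read.
    public static int readyCount() throws Exception {
        Selector selector = Selector.open();
        Pipe pipe = Pipe.open();                              // stand-in for a socket channel
        pipe.source().configureBlocking(false);               // selectable channels must be non-blocking
        pipe.source().register(selector, SelectionKey.OP_READ);
        pipe.sink().write(ByteBuffer.wrap("hi".getBytes()));  // make data "arrive"
        int ready = selector.select();                        // ask the kernel which channels are ready
        selector.close();
        pipe.sink().close();
        pipe.source().close();
        return ready;                                         // 1 → the source channel is readable
    }

    public static void main(String[] args) throws Exception {
        System.out.println(readyCount());
    }
}
```

One select call can watch many channels at once; here we register only one for brevity.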

Tomcat’s NioEndpoint component is complex to implement, but the basic principles are the above two steps. The LimitLatch, Acceptor, Poller, SocketProcessor, and Executor components work as shown in the following figure:

Because it uses I/O multiplexing, the Poller essentially holds a Java Selector that detects I/O events on channels. When data is ready to be read or written, it creates a SocketProcessor task and hands it to the thread pool. In other words, a small number of threads listen for read/write events, and a dedicated thread pool performs the actual reads and writes, which improves performance.

Custom thread pool model

In order to improve processing power and concurrency, Web containers usually leave the processing of requests to the thread pool. Tomcat extends Java’s native thread pool to improve concurrency requirements. Before getting into Tomcat thread pool principles, let’s review Java thread pool principles.

Java thread pool

In simple terms, a Java thread pool maintains an array of threads and a queue of tasks. When a task becomes overloaded, it is put into a queue and processed slowly.

ThreadPoolExecutor

To peek into the constructor of the thread pool core class, we need to understand what each parameter does to understand how a thread pool works.

    public ThreadPoolExecutor(int corePoolSize,
                              int maximumPoolSize,
                              long keepAliveTime,
                              TimeUnit unit,
                              BlockingQueue<Runnable> workQueue,
                              ThreadFactory threadFactory,
                              RejectedExecutionHandler handler) { ... }
  • corePoolSize: the number of threads kept in the pool even when idle, unless allowCoreThreadTimeOut is set.
  • maximumPoolSize: the maximum number of threads allowed in the pool once the queue is full.
  • keepAliveTime, unit: when the thread count exceeds the core size, extra idle threads are destroyed after this hold time; unit is the time unit of keepAliveTime. If allowCoreThreadTimeOut(true) is set, idle core threads are also reclaimed after keepAliveTime.
  • workQueue: once the thread count reaches corePoolSize, new tasks are put into the workQueue, and pool threads pull work from it by calling its poll method.
  • threadFactory: the factory used to create threads, e.g. to make them daemon threads or set thread names.
  • handler (RejectedExecutionHandler): executes the rejection policy when both the thread limit and the queue capacity have been reached. You can supply a custom policy simply by implementing RejectedExecutionHandler. The default policy, AbortPolicy, rejects the task and throws a RejectedExecutionException; CallerRunsPolicy instead runs the task on the thread that submitted it.
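As a quick illustration of the rejection policies, the hypothetical sketch below saturates a tiny pool and shows CallerRunsPolicy falling back to the submitting thread (the pool sizes are arbitrary; the latch just keeps the worker busy):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class RejectDemo {
    // With core = max = 1 and a queue of capacity 1, a third task is rejected;
    // CallerRunsPolicy then runs it on the submitting thread instead of throwing.
    static String rejectedTaskRunsOn() throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(1, 1, 0L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(1), new ThreadPoolExecutor.CallerRunsPolicy());
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> { try { release.await(); } catch (InterruptedException ignored) {} };
        String[] ranOn = new String[1];
        pool.execute(blocker);   // occupies the only worker thread
        pool.execute(blocker);   // fills the queue
        pool.execute(() -> ranOn[0] = Thread.currentThread().getName()); // rejected → caller runs it
        release.countDown();
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return ranOn[0];         // the submitting thread's name
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(rejectedTaskRunsOn());
    }
}
```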

To analyze the relationship between each parameter:

When a new task is submitted: if the number of threads in the pool is less than corePoolSize, a new thread is created to execute it; once the thread count reaches corePoolSize, new tasks are placed in the workQueue.

If the workQueue is full and the current thread count is less than maximumPoolSize, a temporary thread is created to execute the task. If the total thread count has reached maximumPoolSize, no more threads are created and the rejection policy runs: DiscardPolicy silently discards the task; DiscardOldestPolicy discards the oldest unprocessed task.
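This growth-then-reject flow can be reproduced with a small, hypothetical demo (the pool sizes are chosen only to make each stage visible; the latch keeps the workers busy so the sequence is deterministic):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolFlowDemo {
    // core=1, max=2, queue capacity=1: the 4th concurrent task finds the queue
    // full and the pool already at maximumPoolSize, so it is rejected.
    static boolean fourthTaskRejected() throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(1, 2, 60L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(1));   // default policy is AbortPolicy
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> { try { release.await(); } catch (InterruptedException ignored) {} };
        pool.execute(blocker);        // 1st task: core thread created
        pool.execute(blocker);        // 2nd task: core busy → queued
        pool.execute(blocker);        // 3rd task: queue full → temporary thread created
        boolean rejected = false;
        try {
            pool.execute(blocker);    // 4th task: queue full, poolSize == max → rejected
        } catch (RejectedExecutionException e) {
            rejected = true;
        }
        release.countDown();
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return rejected;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(fourthTaskRejected());
    }
}
```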

The specific execution process is shown in the figure below:

Tomcat thread pool

Tomcat ships a custom version of ThreadPoolExecutor that extends java.util.concurrent.ThreadPoolExecutor. Two parameters are key for any thread pool:

  • Number of threads.
  • Queue length.

Tomcat must limit both parameters, or it risks exhausting CPU and memory resources under high concurrency. The class keeps the same name as java.util.concurrent.ThreadPoolExecutor, but behaves more efficiently.

Its constructor looks just like the official Java one:

    public ThreadPoolExecutor(int corePoolSize, int maximumPoolSize, long keepAliveTime, TimeUnit unit, BlockingQueue<Runnable> workQueue, RejectedExecutionHandler handler) {
        super(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue, handler);
        prestartAllCoreThreads();
    }

The component that manages the thread pool in Tomcat is StandardThreadExecutor, which also implements the lifecycle interface. Here is the code that starts the thread pool:

    @Override
    protected void startInternal() throws LifecycleException {
        // Customized task queue
        taskqueue = new TaskQueue(maxQueueSize);
        // Customized thread factory
        TaskThreadFactory tf = new TaskThreadFactory(namePrefix, daemon, getThreadPriority());
        // Create the customized thread pool
        executor = new ThreadPoolExecutor(getMinSpareThreads(), getMaxThreads(), maxIdleTime, TimeUnit.MILLISECONDS, taskqueue, tf);
        executor.setThreadRenewalDelay(threadRenewalDelay);
        if (prestartminSpareThreads) {
            executor.prestartAllCoreThreads();
        }
        taskqueue.setParent(executor);
        // Observer pattern: publish the STARTING event
        setState(LifecycleState.STARTING);
    }

The key points are:

  1. Tomcat has its own custom version of the task queue and thread factory, and can limit the length of the task queue, which is maxQueueSize.
  2. Tomcat also limits the number of threads, setting the number of core threads (minSpareThreads) and the maximum number of thread pools (maxThreads).

To recap, the native Java thread pool process described above works like this:

  • For the first corePoolSize tasks, each task creates a new thread.
  • Further tasks go straight into the queue; when the queue is full but maximumPoolSize has not been reached, temporary threads are created to fight the fire.
  • When the total thread count reaches maximumPoolSize, the rejection policy is executed directly.

The Tomcat thread pool extends the native ThreadPoolExecutor and overrides the execute method to implement its own task-handling logic:

  • For the first corePoolSize tasks, each task creates a new thread.
  • Further tasks go straight into the queue; when the queue is full but maximumPoolSize has not been reached, temporary threads are created to fight the fire.
  • When the total thread count reaches maximumPoolSize, it keeps trying to put the task into the queue; only if the queue is also full does the insert fail and the rejection policy run.

The biggest difference is that Tomcat does not execute the reject policy immediately when the number of threads reaches the maximum. Instead, it tries to add tasks to the task queue again and then executes the reject policy after the adding fails.

The code looks like this:

    public void execute(Runnable command, long timeout, TimeUnit unit) {
        // Record the number of submitted tasks: +1
        submittedCount.incrementAndGet();
        try {
            // Delegate to the Java native thread pool to execute the task
            super.execute(command);
        } catch (RejectedExecutionException rx) {
            // The native pool applied its reject policy (thread count reached maximumPoolSize)
            if (super.getQueue() instanceof TaskQueue) {
                final TaskQueue queue = (TaskQueue) super.getQueue();
                try {
                    // Try once more to force the task into the queue
                    if (!queue.force(command, timeout, unit)) {
                        submittedCount.decrementAndGet();
                        // The queue is still full
                        throw new RejectedExecutionException("Queue capacity is full.");
                    }
                } catch (InterruptedException x) {
                    submittedCount.decrementAndGet();
                    throw new RejectedExecutionException(x);
                }
            } else {
                // Submitted task count: -1
                submittedCount.decrementAndGet();
                throw rx;
            }
        }
    }

The Tomcat thread pool uses submittedCount to track how many tasks have been submitted to the pool but not yet finished, and this ties into Tomcat's customized task queue. TaskQueue extends Java's LinkedBlockingQueue, which is unbounded by default unless given a capacity, so Tomcat gives it one: TaskQueue's constructor takes an integer capacity and passes it to the constructor of its parent LinkedBlockingQueue, preventing a memory overflow from unlimited task buildup. With the default unbounded queue, once the current thread count reaches the core count, adding a task to the queue always succeeds, so the pool would never get a chance to create a new thread.

To solve this problem, TaskQueue overwrites the Offer method of LinkedBlockingQueue to return false when appropriate. False indicates that the task has failed to be added, at which point a new thread is created in the thread pool.

    public class TaskQueue extends LinkedBlockingQueue<Runnable> {
        ...
        // When the pool calls this method, the current thread count is already
        // greater than the core thread count.
        @Override
        public boolean offer(Runnable o) {
            // If the thread count has reached the maximum, no new threads can be
            // created; tasks can only be added to the queue.
            if (parent.getPoolSize() == parent.getMaximumPoolSize())
                return super.offer(o);

            // The current thread count is between the core count and the maximum,
            // so a new thread may be created. There are two cases:

            // 1. If the number of submitted tasks is no greater than the current
            //    thread count, there are idle threads; no need to create a new one.
            if (parent.getSubmittedCount() <= parent.getPoolSize())
                return super.offer(o);

            // 2. If the number of submitted tasks is greater than the current
            //    thread count, threads are insufficient; return false so the
            //    pool creates a new thread.
            if (parent.getPoolSize() < parent.getMaximumPoolSize())
                return false;

            // By default, add the task to the queue.
            return super.offer(o);
        }
    }

A new thread is created only when the current thread count is greater than the core count but less than the maximum, and the number of submitted tasks exceeds the current thread count; in other words, there are not enough threads but the limit has not yet been hit. This is why Tomcat maintains the submitted-task counter: it gives the thread pool a chance to create new threads even when the task queue is effectively unlimited. You can still cap the task queue length with the maxQueueSize parameter.
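The native behavior that Tomcat's TaskQueue works around can be reproduced with a short, hypothetical demo: with an unbounded LinkedBlockingQueue, offer() always succeeds, so the pool never grows past corePoolSize no matter how many tasks pile up.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class UnboundedQueueDemo {
    // core=1, max=4, unbounded queue: 10 pending tasks, yet only 1 thread exists.
    static int poolSizeAfterBurst() throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(1, 4, 60L, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>());   // no capacity → Integer.MAX_VALUE
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < 10; i++) {
            pool.execute(() -> { try { release.await(); } catch (InterruptedException ignored) {} });
        }
        int size = pool.getPoolSize();          // stays at corePoolSize; max is never reached
        release.countDown();
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return size;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(poolSizeAfterBurst());
    }
}
```

Tomcat's overridden offer() breaks exactly this pattern by returning false when there are more submitted tasks than threads.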

Performance optimization

Thread pool tuning

Closely related to the I/O model are thread pools, and thread pool tuning is about setting reasonable thread pool parameters. Let’s start by looking at the key parameters in the Tomcat thread pool:

  • threadPriority: thread priority; default 5.
  • daemon: whether threads are daemon (background) threads; default true.
  • namePrefix: thread name prefix.
  • maxThreads: maximum number of threads; default 200.
  • minSpareThreads: minimum number of threads (threads idle beyond a certain time are recycled down to this count); default 25.
  • maxIdleTime: maximum idle time of a thread before it is reclaimed, down to minSpareThreads; default 1 minute.
  • maxQueueSize: maximum length of the task queue.
  • prestartAllCoreThreads: whether to create minSpareThreads threads when the thread pool starts.

If maxThreads is set too small, Tomcat suffers thread starvation and requests queue up waiting to be processed, lengthening response times. If maxThreads is set too large, that is also a problem: the server has a limited number of CPU cores, and too many threads force frequent context switches across the CPU, whose overhead is significant.

Thread I/O time versus CPU time

Here we have another formula for calculating the number of thread pools, assuming the server is single-core:

Thread pool size = (thread I/O blocking time + thread CPU time)/thread CPU time

Thread I/O blocking time + thread CPU time = average request processing time.
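As a worked example of the formula (the 40 ms / 10 ms split below is made up purely for illustration):

```java
public class PoolSizing {
    // threads = cores * (ioTime + cpuTime) / cpuTime
    static long idealPoolSize(int cores, long ioMillis, long cpuMillis) {
        return cores * (ioMillis + cpuMillis) / cpuMillis;
    }

    public static void main(String[] args) {
        // Hypothetical request: 40 ms blocked on I/O, 10 ms of CPU work, single core.
        System.out.println(idealPoolSize(1, 40, 10)); // (40 + 10) / 10 = 5 threads
    }
}
```

Intuitively: while one request waits on I/O, four other threads can keep the core busy, so five threads saturate one core.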

Tomcat memory overflow cause analysis and tuning

When the JVM throws a java.lang.OutOfMemoryError, in addition to a one-line description it also prints a stack trace, which we can use to find the cause of the error. Before hunting for the cause, let's look at what can trigger an OutOfMemoryError; memory leaks are a common one.

In practice, tuning is largely about finding the system's bottleneck. Suppose the system responds slowly but CPU usage is not high, memory keeps growing, and a heap dump shows a large number of requests piled up in the thread pool's queue. What then? The requests are probably taking too long to process; check whether calls to the database or external services are slow.

java.lang.OutOfMemoryError: Java heap space

This exception is thrown when the JVM is unable to allocate objects in the heap for the following reasons:

  1. Memory leak: objects that should be collected cannot be, because the program keeps holding references to them, e.g. via a ThreadLocal, an object pool, or a memory pool. To find the leak, generate a heap dump with the jmap tool and analyze it with MAT: jmap -dump:live,format=b,file=filename.bin pid

  2. Insufficient memory: the configured heap is simply too small for the application; adjust the heap size via JVM parameters, e.g. -Xms256m -Xmx2048m.

  3. Overuse of finalize methods. If we want to run some logic, such as releasing resources an object holds, before a Java object is GC'd, we can define a finalize method on its class. The JVM then does not reclaim such instances immediately; it adds them to a java.lang.ref.Finalizer.ReferenceQueue, runs each object's finalize method, and only then reclaims them. The Finalizer thread competes with the main thread for CPU, but because of its low priority it may not keep up with the main thread's rate of object creation, so objects accumulate in the ReferenceQueue until an OutOfMemoryError is eventually thrown. The fix is to avoid defining finalize methods on Java classes.

java.lang.OutOfMemoryError: GC overhead limit exceeded

The garbage collector runs continuously but reclaims hardly any memory. By default, the JVM throws this OutOfMemoryError when the process spends more than 98% of its time in GC yet recovers less than 2% of the heap.

The solution is to look at the GC logs or generate a heap dump to check whether memory really is running out; if it is not a leak, increase the heap size. GC logs can be enabled with the following JVM startup parameters:

    -verbose:gc          // Print GC summaries to the console
    -XX:+PrintGCDetails  // Print detailed GC information
    -Xloggc:<filepath>   // Write the GC log to the specified file

For example, java -verbose:gc -Xloggc:gc.log -XX:+PrintGCDetails -jar xxx.jar records GC logs to gc.log, which you can then open and analyze with GCViewer.

java.lang.OutOfMemoryError: Requested array size exceeds VM limit

This exception means the requested array size exceeds the JVM limit: the application tried to allocate a very large array. For example, if a program tries to allocate a 128 MB array but the heap is only 100 MB, this is usually a configuration problem: either the JVM heap is set too small, or the program has a bug.

java.lang.OutOfMemoryError: Metaspace

JVM metaspace is allocated out of native memory, but its size is capped by the MaxMetaspaceSize parameter. When metaspace usage exceeds MaxMetaspaceSize, the JVM throws an OutOfMemoryError mentioning Metaspace. The solution is to increase the value of MaxMetaspaceSize.
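For example, the cap can be raised at startup (the 256m value is only an illustration; tune it for your workload):

```
java -XX:MaxMetaspaceSize=256m -jar app.jar
```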

java.lang.OutOfMemoryError: Request size bytes for reason. Out of swap space

The Java HotSpot VM throws this exception when a native heap allocation fails or native memory is nearly exhausted. The VM then triggers its fatal error handling mechanism and generates a fatal error log file containing useful information about the threads, the process, and the operating system at crash time. For this type of OutOfMemoryError, diagnose using the error information the JVM prints, or use the operating system's DTrace tool to trace system calls and see which code keeps allocating native memory.

java.lang.OutOfMemoryError: Unable to create new native thread

  1. The Java program asks the JVM to create a new Java thread.
  2. JVM Native Code agents this request by calling the operating system API to create an operating system-level Thread Native Thread.
  3. When the operating system creates a new native thread, it must allocate memory for it: each native thread has a thread stack, whose size is determined by the JVM parameter -Xss.
  4. The operating system may fail to create a new thread for a variety of reasons, as discussed below.
  5. The JVM throws a java.lang.OutOfMemoryError: unable to create new native thread error.

This is just an overview of the scenarios; production troubleshooting will be covered in a future article. Space is limited, so we won't expand further here. Follow "Code Brother Byte" for more hard-core content to chew on!

conclusion

We reviewed Tomcat's overall architecture design and broke down in detail how Tomcat handles highly concurrent connections, and also shared ideas for efficiently reading open source framework code. Design patterns and concurrent programming matter most here; readers can study them in the historical articles of "Code Brother Byte".

Recommended reading

Tomcat architecture analysis to work reference

Design Patterns Album

Concurrent programming practice

We disassembled Tomcat's core components and their interface-oriented, single-responsibility design philosophy, summarized the I/O models involved in connectors and explained each in detail, then looked at how Tomcat implements NIO and customizes its thread pool and queue for high concurrency. Finally, we briefly shared common OOM scenarios and their solutions. Follow "Code Brother Byte" for more online troubleshooting and tuning ideas; stay tuned…

If you have questions or want to discuss, you can add my personal WeChat: MageByte1024, and we can learn and progress together.

You can also join the technical group through the official account menu; there are big names from Ali and Tencent there.

Writing articles is not easy. If you found this useful, please follow the "Code Brother Byte" official account.