Is Spring WebFlux a success now? The other answers are also very interesting; check them out if you like.

At present, synchronous microservices based on Spring Web have one major defect compared with asynchronous microservices based on Spring WebFlux: they do not handle client request timeouts well. When a client request times out, the client immediately receives a timeout exception, but in a synchronous Spring Web service the task already running on the server is not cancelled, whereas in an asynchronous Spring WebFlux service it is. There is currently no good way to cancel these timed-out tasks in a synchronous environment.
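
To make the difference concrete, here is a minimal WebFlux sketch (the endpoint, class name, and delay are illustrative, not from the original): because the response is a Publisher, a client disconnect or timeout cancels the subscription and the pending work along with it, while a blocking Spring MVC handler has no equivalent cancellation path.

import java.time.Duration;

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

import reactor.core.publisher.Mono;

// Hypothetical WebFlux controller: if the client disconnects or gives up, the framework
// cancels the subscription, doOnCancel fires, and the pending work is abandoned.
@RestController
public class SlowController {

    @GetMapping("/slow")
    public Mono<String> slow() {
        return Mono.delay(Duration.ofSeconds(30))       // stands in for a slow backend call
                .map(ignored -> "done")
                .doOnCancel(() -> System.out.println("client gave up, work cancelled"));
    }
}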

The biggest problem with Spring WebFlux is that many frameworks, including the JDK itself, carry thread-bound context such as ThreadLocal. The MDC implementations of the Java logging frameworks are generally ThreadLocal-based maps. Distributed locks such as Redisson's generate a unique ID per thread, bind it to that thread, and check it when unlocking. None of these designs is compatible with the Reactor Context used by Spring WebFlux, where a pipeline may hop between threads (a minimal sketch of the mismatch follows the list below). Spring Cloud Sleuth does add and maintain a trace context for Spring WebFlux, but it still has quite a few bugs and gaps:

  • Spring Cloud Gateway has no trace information, and I was dumbfounded (Part 1)
  • Spring Cloud Gateway has no trace information, and I was dumbfounded (Part 2)
  • Spring Cloud Gateway has no trace information, and I was dumbfounded (Part 3)
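
As a minimal sketch of the ThreadLocal mismatch (the class name and values are mine, purely illustrative): a value set on the calling thread is not visible once the pipeline hops to a boundedElastic thread, and log MDC behaves the same way since it is ThreadLocal-backed; Reactor's Context, by contrast, is attached to the subscription rather than to any thread.

import reactor.core.publisher.Mono;
import reactor.core.scheduler.Schedulers;

// Illustrative demo: the ThreadLocal set on the main thread is lost after the thread hop,
// while the Reactor Context value travels with the subscription.
public class ThreadLocalVsContextDemo {

    private static final ThreadLocal<String> TRACE_ID = new ThreadLocal<>();

    public static void main(String[] args) {
        TRACE_ID.set("abc123");

        String fromThreadLocal = Mono.just("request")
                .subscribeOn(Schedulers.boundedElastic())
                .map(v -> v + " traceId=" + TRACE_ID.get()) // null: runs on a boundedElastic thread
                .block();
        System.out.println(fromThreadLocal);

        String fromContext = Mono.deferContextual(ctx -> Mono.just("traceId=" + ctx.get("traceId")))
                .contextWrite(ctx -> ctx.put("traceId", "abc123")) // bound to the subscription, not a thread
                .block();
        System.out.println(fromContext);
    }
}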

Another complication is incompatibility with existing blocking lock designs, because reactive programming requires non-blocking code. Concurrency contention then has to be reworked into serialized, queue-based consumption, which is also a lot of work.
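
One way to do that, as a rough sketch (class and method names are mine, not from the article): push the competing operations into a Sinks-backed queue and let concatMap consume them one at a time, so the critical section is serialized without any thread blocking on a lock.

import java.time.Duration;

import reactor.core.publisher.Mono;
import reactor.core.publisher.Sinks;

// Illustrative sketch: callers submit their critical-section work to a single queue and
// concatMap executes the tasks strictly one at a time, replacing a blocking lock with
// serialized, queue-based consumption.
public class SerializedExecutor {

    private final Sinks.Many<Runnable> queue = Sinks.many().unicast().onBackpressureBuffer();

    public SerializedExecutor() {
        queue.asFlux()
                .concatMap(task -> Mono.fromRunnable(task)) // one task at a time, in arrival order
                .subscribe();
    }

    public void submit(Runnable criticalSection) {
        // Sinks require serialized emission, so concurrent callers briefly retry instead of blocking on a lock
        queue.emitNext(criticalSection, Sinks.EmitFailureHandler.busyLooping(Duration.ofMillis(100)));
    }
}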

Then there is database I/O: the official library (JDBC) is not NIO. Whether it is Java's own Future-based asynchronous framework, Spring WebFlux, or Vert.x, they are all non-blocking frameworks built on the Reactor model (the latter two are both implemented on top of Netty). In blocking programming, every request needs to be handled by a thread, and if I/O blocks, that thread blocks with it. In non-blocking, reactive programming, threads are not blocked on I/O and can go on to handle other requests.

As a simple example, suppose there is only one thread pool. When a request comes in and needs to read from the database over non-blocking NIO, the thread writes the request to the database connection and returns immediately. When the database responds, the connection's Selector reports a read event and the response is processed (effectively a callback) on the thread pool, not necessarily on the same thread as before. Instead of waiting for the database to return, the thread can serve other requests in the meantime, so even if one service's SQL takes a long time to execute, other services are unaffected.

The foundation of all this, however, is that the I/O must be non-blocking, i.e. NIO (or AIO). Official JDBC has no NIO implementation, only BIO: the thread cannot write the request to the connection and return; it must wait for the response. The usual workaround is to run the database request on another thread pool and hand the result back via a callback: business thread pool A passes the blocking JDBC call to thread pool B, B reads the data and passes the remaining business logic back to A, so A does not block and can handle other requests (sketched below). However, if some business SQL executes slowly, all of B's threads can end up blocked with a full queue, and then A's requests back up anyway, so this is not a perfect solution. To be truly complete, JDBC would need an NIO implementation. There are, of course, third-party asynchronous and reactive database libraries, but being unofficial, their compatibility and usage restrictions can be tricky.
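
A hedged sketch of that workaround (jdbcTemplate, the query, and the class name are placeholders): the blocking JDBC call is shifted onto Reactor's boundedElastic scheduler, playing the role of "thread pool B", so the event-loop threads ("thread pool A") stay free, while the query itself still occupies a B thread for its full duration.

import java.util.Map;

import org.springframework.jdbc.core.JdbcTemplate;

import reactor.core.publisher.Mono;
import reactor.core.scheduler.Schedulers;

// Illustrative only: wrap the blocking JDBC call so it runs on boundedElastic ("pool B")
// instead of the Netty event loop ("pool A"). This keeps A responsive, but the query
// still blocks a B thread, which is why it is a mitigation rather than real NIO.
public class BlockingJdbcBridge {

    private final JdbcTemplate jdbcTemplate;

    public BlockingJdbcBridge(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    public Mono<Map<String, Object>> findUser(long id) {
        return Mono.fromCallable(() ->
                        jdbcTemplate.queryForMap("select * from users where id = ?", id))
                .subscribeOn(Schedulers.boundedElastic()); // the blocking call never touches the event loop
    }
}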

Finally, there’s Java’s own Project Loom, which I’ve explored briefly:

  • JEP series 3 – The non-blocking principle behind synchronous network I/O on virtual threads

In a nutshell: when the Java synchronous networking API runs on a virtual thread, it switches the underlying native socket to non-blocking mode. When Java code issues an I/O request that cannot complete immediately (the native socket returns EAGAIN, i.e. "not ready / would block"), the underlying socket is registered with an internal JVM event-notification mechanism (the Poller) and the virtual thread is parked. When the underlying I/O operation becomes ready (an event arrives at the Poller), the virtual thread is unparked and the underlying socket operation is retried. The synchronous Java network APIs have already been re-implemented for this; the related JEPs include JEP 353 and JEP 373. In short, I/O operations that cannot complete immediately on a virtual thread simply park that virtual thread, and when the I/O is ready the virtual thread is unparked. Code written this way is much simpler than today's asynchronous non-blocking I/O code and hides many implementation details the business does not care about.
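
A small sketch of what that looks like in practice (host, request, and class name are illustrative; the API shown is the one in recent Loom-enabled JDK builds): the code below is plain blocking-style socket I/O, but on a virtual thread a read that cannot complete immediately parks the virtual thread and frees the carrier thread.

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Illustrative demo: blocking-style socket I/O on a virtual thread. The read() that is
// not immediately ready registers the socket with the JVM's internal Poller and parks
// the virtual thread; the carrier thread stays free to run other virtual threads.
public class VirtualThreadSocketDemo {

    public static void main(String[] args) throws Exception {
        Thread vt = Thread.ofVirtual().name("blocking-style-io").start(() -> {
            try (Socket socket = new Socket("example.com", 80)) {
                OutputStream out = socket.getOutputStream();
                out.write("GET / HTTP/1.0\r\nHost: example.com\r\n\r\n".getBytes(StandardCharsets.US_ASCII));
                out.flush();

                InputStream in = socket.getInputStream();
                byte[] buf = new byte[1024];
                int n = in.read(buf); // parks this virtual thread, not the carrier thread
                System.out.println(new String(buf, 0, Math.max(n, 0), StandardCharsets.US_ASCII));
            } catch (IOException e) {
                e.printStackTrace();
            }
        });
        vt.join();
    }
}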

Project Loom solves the primary problem of blocking network I/O and implements fibers with little change to existing code: you get non-blocking behaviour in a blocking code style, and it stays compatible with today's thread-bound context frameworks. Many of the new features in Java 17 are paving the way for the launch of Project Loom. Check out Nicolai and Goetz's video, starting at 19:17:

  • Nicolai and Goetz on Java AMA

Brian Goetz: “I think Project Loom is going to kill Reactive Programming”

However, the expectation is that existing threads and thread pools can be replaced with virtual threads with only small code changes such as the following:

// Create an unstarted virtual thread for a given Runnable
Thread thread = Thread.ofVirtual().name("duke").unstarted(runnable);

// A ThreadFactory that produces virtual threads
ThreadFactory factory = Thread.ofVirtual().factory();

// An ExecutorService that runs each submitted task on a new virtual thread
// (the method name changed during the Loom previews; the released API is
// Executors.newVirtualThreadPerTaskExecutor())
ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor();

But there are still many problems to solve:

  1. ThreadLocal-related classes: since virtual threads can be created in practically unlimited numbers, ThreadLocal usage needs to be redesigned. First, many frameworks inside the JDK rely on probe/ThreadLocal mechanisms, such as the per-thread seed of ThreadLocalRandom. Second, heavy ThreadLocal use can increase GC pressure precisely because virtual threads can be created without limit.
  2. Places where the underlying platform (carrier) thread is still blocked: blocking inside a synchronized monitor still blocks the carrier thread, as does file I/O, etc. (see the sketch after this list).
  3. Since these changes touch the foundations of several JDK frameworks, the bugs and security issues hiding in the details may be hard to discover with only unit tests and stress tests before going to production.
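
To illustrate problem 2 above (a sketch, assuming current Loom behaviour; the class and method names are mine): blocking while inside a synchronized block pins the virtual thread to its carrier thread, whereas a java.util.concurrent lock lets the virtual thread park and release the carrier.

import java.util.concurrent.locks.ReentrantLock;

// Illustrative comparison: synchronized pins the carrier thread while blocked,
// ReentrantLock does not, which is the commonly suggested workaround.
public class PinningExample {

    private final Object monitor = new Object();
    private final ReentrantLock lock = new ReentrantLock();

    void pinned() throws InterruptedException {
        synchronized (monitor) {
            Thread.sleep(1000); // blocking here pins the carrier (platform) thread
        }
    }

    void notPinned() throws InterruptedException {
        lock.lock();
        try {
            Thread.sleep(1000); // the virtual thread parks; the carrier is released
        } finally {
            lock.unlock();
        }
    }
}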

Even so, recent Java releases have already been paving the way for Project Loom, for example:

  • JEP 353 (Reimplement the Legacy Socket API) in Java 13 and JEP 373 (Reimplement the Legacy DatagramSocket API) in Java 15 were also designed with Project Loom's network I/O in mind
  • JEP 416 (Reimplement Core Reflection with Method Handles) in Java 18 re-implements reflection to reduce the native stack frames hit by Loom virtual threads (because there can be a very large number of virtual threads, having all of them go through native thread stacks would be a serious performance problem)
  • JEP 418 (Internet-Address Resolution SPI) in Java 18 is designed to address the problem of DNS resolution blocking the carrier thread underneath a virtual thread
  • Others include the JEP draft on Scope Locals, which rethinks thread-local variables (such as ThreadLocal) and is also part of the Loom roadmap
