
Background

HTTP is a stateless protocol: each request is independent of the others. In its initial implementation, every HTTP request opened a new TCP socket connection, which was closed once the exchange was complete.

HTTP runs on top of TCP, a full-duplex protocol, so establishing and tearing down a connection requires a three-way handshake and a four-way close. In this design every HTTP request therefore pays the extra cost of setting up and destroying a connection.

For this reason the HTTP protocol evolved to reuse socket connections through persistent connections.



As can be seen from the figure:

  1. With serial connections, every exchange opens and closes its own connection.
  2. With a persistent connection, the first exchange opens the connection and leaves it open afterwards, so the next exchange does not need to establish a connection again.

There are two implementations of persistent connections: HTTP/1.0+ keep-alive and HTTP/1.1 persistent connections.

HTTP/1.0+ keep-alive

Starting around 1996, many HTTP/1.0 browsers and servers extended the protocol with what is known as the “keep-alive” extension.

Note that this extension was added on top of 1.0 as an “experimental persistent connection”. Keep-alive is deprecated and is no longer described in the current HTTP/1.1 specification, although many applications still use it.

A client using HTTP/1.0 adds “Connection: keep-alive” to the request headers to ask the server to keep the connection open. If the server is willing to keep it open, it includes the same header in the response. If the response does not contain the “Connection: keep-alive” header, the client assumes that the server does not support keep-alive and will close the connection once the response has been sent.

The keep-alive extension gives the client and server a way to keep a connection open. However, there are still some problems:

  • Keep-alive is not part of the HTTP/1.0 standard. The client must send Connection: keep-alive to activate it.
  • Proxy servers may not support keep-alive. Some proxies are “blind relays” that do not understand the Connection header and forward it unchanged, even though it is a hop-by-hop header. The client and the server may then both believe the connection is persistent, while the proxy does not expect any further request on it.

HTTP/1.1 persistent connection

HTTP/1.1 replaces keep-alive with a persistent connection.

HTTP/1.1 connections are persistent by default. To close a connection explicitly, add the Connection: close header to the message. In other words, in HTTP/1.1 every connection is reused unless stated otherwise.

However, as with keep-alive, an idle persistent connection can be closed by either the client or the server at any time. Not sending Connection: close does not mean that the server promises to keep the connection open forever.

How does HttpClient create persistent connections

HttpClient uses a connection pool to manage the connections it holds, so that requests to the same destination can be reused over the same TCP connection.

“Pooling” is in fact a general-purpose technique, and the idea behind it is not complicated:

  1. Establish a connection the first time it is needed
  2. When the request finishes, do not close the connection; return it to the pool
  3. The next request to the same destination can take an available connection from the pool
  4. Periodically clean up expired connections
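
The four steps can be sketched in a few lines of Java. This is only a minimal, single-threaded illustration and not HttpClient's implementation: the class SimpleConnectionPool, its Conn stand-in for a socket, and the explicit nowMillis clock parameter are all assumptions made for this example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/** Hypothetical, single-threaded pool sketch; not HttpClient's implementation. */
class SimpleConnectionPool {

    /** Stand-in for a real socket connection. */
    static final class Conn {
        final String destination;
        long expiresAtMillis;
        boolean closed;

        Conn(String destination) {
            this.destination = destination;
        }

        void close() {
            closed = true;
        }
    }

    private final Map<String, Deque<Conn>> idle = new HashMap<>();
    private final long timeToLiveMillis;

    SimpleConnectionPool(long timeToLiveMillis) {
        this.timeToLiveMillis = timeToLiveMillis;
    }

    /** Steps 1 and 3: reuse an idle connection to the destination, or create a new one. */
    Conn acquire(String destination, long nowMillis) {
        Deque<Conn> queue = idle.get(destination);
        while (queue != null && !queue.isEmpty()) {
            Conn conn = queue.pollFirst();
            if (conn.expiresAtMillis > nowMillis) {
                return conn;
            }
            conn.close(); // expired while idle, so discard it
        }
        return new Conn(destination);
    }

    /** Step 2: keep the connection open and put it back into the pool. */
    void release(Conn conn, long nowMillis) {
        conn.expiresAtMillis = nowMillis + timeToLiveMillis;
        idle.computeIfAbsent(conn.destination, k -> new ArrayDeque<>()).addFirst(conn);
    }

    /** Step 4: close every idle connection that has expired; returns how many were closed. */
    int evictExpired(long nowMillis) {
        int evicted = 0;
        for (Deque<Conn> queue : idle.values()) {
            for (Iterator<Conn> it = queue.iterator(); it.hasNext(); ) {
                Conn conn = it.next();
                if (conn.expiresAtMillis <= nowMillis) {
                    conn.close();
                    it.remove();
                    evicted++;
                }
            }
        }
        return evicted;
    }

    public static void main(String[] args) {
        SimpleConnectionPool pool = new SimpleConnectionPool(1_000);
        Conn first = pool.acquire("example.org:80", 0);
        pool.release(first, 0);
        Conn second = pool.acquire("example.org:80", 500);
        System.out.println("reused: " + (first == second));
        pool.release(second, 500);
        System.out.println("evicted: " + pool.evictExpired(2_000));
    }
}
```

Passing the current time as a parameter keeps the sketch deterministic; a real pool would read the clock itself and would also need locking, which the sections on HttpClient's source cover.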

All connection pools follow this idea. When reading the HttpClient source code, we focus on two points:

  • The design of the connection pool itself, as a reference for our own designs in the future
  • How it maps to the HTTP protocol, that is, how the theoretical abstractions are turned into code

4.1 Implementation of HttpClient connection pool

HttpClient’s handling of persistent connections can be summarized in the following code, which extracts the connection pool-related parts from MainClientExec and removes the rest:

public class MainClientExec implements ClientExecChain {

    @Override
    public CloseableHttpResponse execute(
            final HttpRoute route,
            final HttpRequestWrapper request,
            final HttpClientContext context,
            final HttpExecutionAware execAware) throws IOException, HttpException {
        // Ask the connection manager (HttpClientConnectionManager) for a ConnectionRequest
        final ConnectionRequest connRequest = connManager.requestConnection(route, userToken);
        final HttpClientConnection managedConn;
        final int timeout = config.getConnectionRequestTimeout();
        // Obtain a managed connection (HttpClientConnection) from the ConnectionRequest
        managedConn = connRequest.get(timeout > 0 ? timeout : 0, TimeUnit.MILLISECONDS);
        // Hand the connection manager and the managed connection to a ConnectionHolder
        final ConnectionHolder connHolder = new ConnectionHolder(this.log, this.connManager, managedConn);
        try {
            HttpResponse response;
            if (!managedConn.isOpen()) {
                // If the managed connection is not open yet, establish the route first
                establishRoute(proxyAuthState, managedConn, route, request, context);
            }
            response = requestExecutor.execute(request, managedConn, context);
            // Use the connection reuse strategy to decide whether the connection is reusable
            if (reuseStrategy.keepAlive(response, context)) {
                // Get the period for which the connection stays valid
                final long duration = keepAliveStrategy.getKeepAliveDuration(response, context);
                // Set the validity period of the connection
                connHolder.setValidFor(duration, TimeUnit.MILLISECONDS);
                // Mark the connection as reusable
                connHolder.markReusable();
            } else {
                connHolder.markNonReusable();
            }
            final HttpEntity entity = response.getEntity();
            if (entity == null || !entity.isStreaming()) {
                // Release the connection back to the pool for the next call
                connHolder.releaseConnection();
                return new HttpResponseProxy(response, null);
            } else {
                return new HttpResponseProxy(response, connHolder);
            }
        } catch (final IOException | HttpException | RuntimeException ex) {
            // Error handling is abridged in this excerpt: abort the connection and rethrow
            connHolder.abortConnection();
            throw ex;
        }
    }
}

Here we can see that the way a connection is handled during an HTTP request is consistent with the protocol specification. Below we look at the implementation in more detail.

PoolingHttpClientConnectionManager is HttpClient’s default connection manager. The first step is a call to requestConnection(), which returns a connection request; note that this is not yet a connection.

    public ConnectionRequest requestConnection(
            final HttpRoute route,
            final Object state) {
        final Future<CPoolEntry> future = this.pool.lease(route, state, null);
        return new ConnectionRequest() {

            @Override
            public boolean cancel() {
                return future.cancel(true);
            }

            @Override
            public HttpClientConnection get(
                    final long timeout,
                    final TimeUnit tunit) throws InterruptedException, ExecutionException, ConnectionPoolTimeoutException {
                final HttpClientConnection conn = leaseConnection(future, timeout, tunit);
                if (conn.isOpen()) {
                    // The connection is already open: apply the socket timeout of the next hop
                    final HttpHost host;
                    if (route.getProxyHost() != null) {
                        host = route.getProxyHost();
                    } else {
                        host = route.getTargetHost();
                    }
                    final SocketConfig socketConfig = resolveSocketConfig(host);
                    conn.setSocketTimeout(socketConfig.getSoTimeout());
                }
                return conn;
            }

        };
    }

You can see that the returned ConnectionRequest object actually wraps a Future<CPoolEntry>; the CPoolEntry is the real connection instance managed by the connection pool.

From the above code we should focus on:

  • Future<CPoolEntry> future = this.pool.lease(route, state, null)
    • How an asynchronous handle to a connection, the Future, is obtained from the connection pool CPool
  • HttpClientConnection conn = leaseConnection(future, timeout, tunit)
    • How the real connection, HttpClientConnection, is obtained from that Future

4.2 Future

To see how CPool hands out a Future<CPoolEntry>, look at the core code of AbstractConnPool:

    private E getPoolEntryBlocking(
            final T route, final Object state,
            final long timeout, final TimeUnit tunit,
            final Future<E> future) throws IOException, InterruptedException, TimeoutException {
        // Turn the timeout into an absolute deadline (no deadline if timeout <= 0)
        Date deadline = null;
        if (timeout > 0) {
            deadline = new Date(System.currentTimeMillis() + tunit.toMillis(timeout));
        }
        // First lock the connection pool; the lock is a reentrant lock (ReentrantLock)
        this.lock.lock();
        try {
            // Get the pool for this HttpRoute. In HttpClient the overall pool has its own limit
            // and every route has a pool of its own inside it, so this is a "pool within a pool"
            final RouteSpecificPool<T, C, E> pool = getPool(route);
            E entry;
            for (;;) {
                Asserts.check(!this.isShutDown, "Connection pool shut down");
                // Loop until a usable connection or null is obtained
                for (;;) {
                    // Take a connection from the pool of this route: either null or a pooled connection
                    entry = pool.getFree(state);
                    // If null, leave the loop
                    if (entry == null) {
                        break;
                    }
                    // If the connection is expired or closed, release its resources and continue the loop
                    if (entry.isExpired(System.currentTimeMillis())) {
                        entry.close();
                    }
                    if (entry.isClosed()) {
                        this.available.remove(entry);
                        pool.free(entry, false);
                    } else {
                        // A valid connection was found, leave the loop
                        break;
                    }
                }
                // Return if a valid connection was found
                if (entry != null) {
                    this.available.remove(entry);
                    this.leased.add(entry);
                    onReuse(entry);
                    return entry;
                }

                // The maximum number of connections per route is configurable
                final int maxPerRoute = getMax(route);
                // If the route pool would exceed its limit, close some connections in LRU order first
                final int excess = Math.max(0, pool.getAllocatedCount() + 1 - maxPerRoute);
                if (excess > 0) {
                    for (int i = 0; i < excess; i++) {
                        final E lastUsed = pool.getLastUsed();
                        if (lastUsed == null) {
                            break;
                        }
                        lastUsed.close();
                        this.available.remove(lastUsed);
                        pool.remove(lastUsed);
                    }
                }

                // The number of connections in the route pool is below its limit
                if (pool.getAllocatedCount() < maxPerRoute) {
                    final int totalUsed = this.leased.size();
                    final int freeCapacity = Math.max(this.maxTotal - totalUsed, 0);
                    // Check whether the overall pool still has room
                    if (freeCapacity > 0) {
                        final int totalAvailable = this.available.size();
                        // If the idle connections take up the remaining room, close the least recently used one
                        if (totalAvailable > freeCapacity - 1) {
                            if (!this.available.isEmpty()) {
                                final E lastUsed = this.available.removeLast();
                                lastUsed.close();
                                final RouteSpecificPool<T, C, E> otherpool = getPool(lastUsed.getRoute());
                                otherpool.remove(lastUsed);
                            }
                        }
                        // Create a new connection for this route
                        final C conn = this.connFactory.create(route);
                        // Add the connection to the "small pool" of the route
                        entry = pool.add(conn);
                        // Add the connection to the "large pool"
                        this.leased.add(entry);
                        return entry;
                    }
                }

                // The limits have been reached: queue up and wait
                boolean success = false;
                try {
                    if (future.isCancelled()) {
                        throw new InterruptedException("Operation interrupted");
                    }
                    // Put the future into the waiting queue of the route pool
                    pool.queue(future);
                    // Put the future into the waiting queue of the large pool
                    this.pending.add(future);
                    // success is true if the condition was signalled before the deadline
                    if (deadline != null) {
                        success = this.condition.awaitUntil(deadline);
                    } else {
                        this.condition.await();
                        success = true;
                    }
                    if (future.isCancelled()) {
                        throw new InterruptedException("Operation interrupted");
                    }
                } finally {
                    // Remove the future from both waiting queues
                    pool.unqueue(future);
                    this.pending.remove(future);
                }
                // Leave the loop if no signal arrived and the deadline has passed
                if (!success && (deadline != null && deadline.getTime() <= System.currentTimeMillis())) {
                    break;
                }
            }
            // No connection could be obtained in time
            throw new TimeoutException("Timeout waiting for connection");
        } finally {
            // Release the lock on the large connection pool
            this.lock.unlock();
        }
    }


The above code logic has several important points:

  • The connection pool has a maximum number of connections overall, and each route has its own small pool with its own maximum
  • When the large pool or a small pool would exceed its limit, some connections are closed in LRU order
  • If a pooled connection is available, it is returned to the caller
  • If no pooled connection is available, HttpClient checks whether the route pool is still below its maximum; if so, it creates a new connection and adds it to the pool
  • If the limit has been reached, the request queues up and waits on the condition. When it is signalled it tries again, and if the wait times out a timeout exception is thrown
  • A ReentrantLock guards the lease operation on the connection pool to ensure thread safety

At this point, the program has either obtained a usable CPoolEntry instance or thrown an exception that ends the attempt.
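
The lease logic summarized above can be condensed into a small sketch built on the same primitives, a ReentrantLock and a Condition. It is a simplified illustration under stated assumptions, not the HttpClient code: TwoLevelPool is a hypothetical class, connections are modelled as plain strings, and the LRU eviction and expiry checks of the real pool are left out.

```java
import java.util.ArrayDeque;
import java.util.Date;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical two-level pool sketch; connections are modelled as plain strings. */
class TwoLevelPool {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition released = lock.newCondition();
    private final Map<String, Deque<String>> idleByRoute = new HashMap<>();
    private final Map<String, Integer> allocatedByRoute = new HashMap<>();
    private final int maxTotal;
    private final int maxPerRoute;
    private int totalAllocated;

    TwoLevelPool(int maxTotal, int maxPerRoute) {
        this.maxTotal = maxTotal;
        this.maxPerRoute = maxPerRoute;
    }

    /** Leases a connection for the route; a timeout of 0 or less waits indefinitely. */
    String lease(String route, long timeout, TimeUnit unit)
            throws InterruptedException, TimeoutException {
        Date deadline = timeout > 0
                ? new Date(System.currentTimeMillis() + unit.toMillis(timeout))
                : null;
        lock.lock();
        try {
            for (;;) {
                // 1. Reuse an idle connection of this route if there is one.
                Deque<String> idle = idleByRoute.get(route);
                if (idle != null && !idle.isEmpty()) {
                    return idle.pollFirst();
                }
                // 2. Otherwise create one if both the route limit and the total limit allow it.
                int allocated = allocatedByRoute.getOrDefault(route, 0);
                if (allocated < maxPerRoute && totalAllocated < maxTotal) {
                    allocatedByRoute.put(route, allocated + 1);
                    totalAllocated++;
                    return route + "#" + (allocated + 1);
                }
                // 3. Otherwise wait until a connection is released or the deadline passes.
                if (deadline == null) {
                    released.await();
                } else if (!released.awaitUntil(deadline)) {
                    throw new TimeoutException("Timeout waiting for connection");
                }
            }
        } finally {
            lock.unlock();
        }
    }

    /** Returns a leased connection to the idle queue of its route and wakes up waiters. */
    void release(String route, String connection) {
        lock.lock();
        try {
            idleByRoute.computeIfAbsent(route, k -> new ArrayDeque<>()).addFirst(connection);
            released.signalAll();
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws Exception {
        TwoLevelPool pool = new TwoLevelPool(2, 1);
        String conn = pool.lease("example.org:80", 0, TimeUnit.MILLISECONDS);
        try {
            pool.lease("example.org:80", 50, TimeUnit.MILLISECONDS);
        } catch (TimeoutException ex) {
            System.out.println("second lease timed out: " + ex.getMessage());
        }
        pool.release("example.org:80", conn);
        System.out.println(conn.equals(pool.lease("example.org:80", 50, TimeUnit.MILLISECONDS)));
    }
}
```

Waiting inside the loop and re-checking the pool after every wake-up is what makes the sketch robust against spurious wake-ups, which is the same reason the original code loops.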

4.3 HttpClientConnection

    protected HttpClientConnection leaseConnection(
            final Future<CPoolEntry> future,
            final long timeout,
            final TimeUnit tunit) throws InterruptedException, ExecutionException, ConnectionPoolTimeoutException {
        final CPoolEntry entry;
        try {
            // Get the CPoolEntry from the asynchronous Future<CPoolEntry>
            entry = future.get(timeout, tunit);
            if (entry == null || future.isCancelled()) {
                throw new InterruptedException();
            }
            Asserts.check(entry.getConnection() != null, "Pool entry with no connection");
            if (this.log.isDebugEnabled()) {
                this.log.debug("Connection leased: " + format(entry) + formatStats(entry.getRoute()));
            }
            // Return a proxy for the CPoolEntry; the proxy operates on the same underlying HttpClientConnection
            return CPoolProxy.newProxy(entry);
        } catch (final TimeoutException ex) {
            throw new ConnectionPoolTimeoutException("Timeout waiting for connection from pool");
        }
    }
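
The pattern in leaseConnection, waiting on a Future with a timeout and translating the generic TimeoutException into a pool-specific one, can be tried out with JDK classes alone. In the sketch below LeaseViaFuture and PoolTimeoutException are hypothetical stand-ins, and a plain String takes the place of CPoolEntry.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class LeaseViaFuture {

    /** Hypothetical stand-in for ConnectionPoolTimeoutException. */
    static class PoolTimeoutException extends Exception {
        PoolTimeoutException(String message) {
            super(message);
        }
    }

    /** Waits for the pooled entry and maps a timeout to the pool-specific exception. */
    static String lease(Future<String> future, long timeout, TimeUnit unit)
            throws InterruptedException, ExecutionException, PoolTimeoutException {
        try {
            String entry = future.get(timeout, unit);
            if (entry == null || future.isCancelled()) {
                throw new InterruptedException();
            }
            return entry;
        } catch (TimeoutException ex) {
            throw new PoolTimeoutException("Timeout waiting for connection from pool");
        }
    }

    public static void main(String[] args) throws Exception {
        // A future that is already completed yields its entry immediately.
        System.out.println(lease(CompletableFuture.completedFuture("conn-1"), 100, TimeUnit.MILLISECONDS));
        // A future that never completes ends in the pool-specific timeout.
        try {
            lease(new CompletableFuture<String>(), 50, TimeUnit.MILLISECONDS);
        } catch (PoolTimeoutException ex) {
            System.out.println(ex.getMessage());
        }
    }
}
```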


How does HttpClient reuse persistent connections?

In the previous chapter, we saw that HttpClient obtains its connections through a connection pool, taking one from the pool whenever a connection is needed.

Recall the four steps of the pool design listed earlier:

  1. Establish a connection the first time it is needed
  2. When the request finishes, do not close the connection; return it to the pool
  3. The next request to the same destination can take an available connection from the pool
  4. Periodically clean up expired connections

We saw in Chapter 4 how HttpClient handles steps 1 and 3, so what about step 2?

How does HttpClient decide whether a connection should be closed after use or returned to the pool for reuse? Take another look at the MainClientExec code:



    response = requestExecutor.execute(request, managedConn, context);
    // Decide whether to reuse the current connection according to the reuse strategy
    if (reuseStrategy.keepAlive(response, context)) {
        // The connection is to be reused: get its keep-alive duration, which is taken from the timeout in the response
        final long duration = keepAliveStrategy.getKeepAliveDuration(response, context);
        if (this.log.isDebugEnabled()) {
            final String s;
            // The duration is in milliseconds; if it is not set it is -1, which means no time limit
            if (duration > 0) {
                s = "for " + duration + " " + TimeUnit.MILLISECONDS;
            } else {
                s = "indefinitely";
            }
            this.log.debug("Connection can be kept alive " + s);
        }
        // Set the validity period of the connection
        connHolder.setValidFor(duration, TimeUnit.MILLISECONDS);
        // Mark the connection as reusable
        connHolder.markReusable();
    } else {
        // Mark the connection as not reusable
        connHolder.markNonReusable();
    }


As you can see, once a request has been executed over a connection, the connection reuse strategy decides whether the connection is to be reused. If it is, the connection is eventually handed back to HttpClientConnectionManager and put into the pool.
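
The duration returned by getKeepAliveDuration comes from the response. As far as we understand the default strategy, it reads the timeout parameter of the response's Keep-Alive header (given in seconds) and falls back to -1 when there is none. The sketch below imitates that behaviour with plain string parsing; KeepAliveDuration and fromHeader are hypothetical names, and the real implementation works on parsed header elements rather than raw strings.

```java
class KeepAliveDuration {

    /**
     * Returns the keep-alive duration in milliseconds taken from a header value
     * such as "timeout=5, max=100", or -1 when no usable timeout is present.
     */
    static long fromHeader(String keepAliveHeaderValue) {
        if (keepAliveHeaderValue == null) {
            return -1;
        }
        for (String element : keepAliveHeaderValue.split(",")) {
            String[] pair = element.trim().split("=", 2);
            if (pair.length == 2 && pair[0].trim().equalsIgnoreCase("timeout")) {
                try {
                    // The header carries seconds; the pool works in milliseconds.
                    return Long.parseLong(pair[1].trim()) * 1000;
                } catch (NumberFormatException ignore) {
                    // Malformed value: keep scanning and fall back to -1.
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(fromHeader("timeout=5, max=100")); // 5000
        System.out.println(fromHeader(null));                 // -1
    }
}
```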

So what is the logic of a connection reuse strategy?

public class DefaultClientConnectionReuseStrategy extends DefaultConnectionReuseStrategy {

    public static final DefaultClientConnectionReuseStrategy INSTANCE = new DefaultClientConnectionReuseStrategy();

    @Override
    public boolean keepAlive(final HttpResponse response, final HttpContext context) {
        // Get the request from the context
        final HttpRequest request = (HttpRequest) context.getAttribute(HttpCoreContext.HTTP_REQUEST);
        if (request != null) {
            // Get the Connection headers of the request
            final Header[] connHeaders = request.getHeaders(HttpHeaders.CONNECTION);
            if (connHeaders.length != 0) {
                final TokenIterator ti = new BasicTokenIterator(new BasicHeaderIterator(connHeaders, null));
                while (ti.hasNext()) {
                    final String token = ti.nextToken();
                    // If the request contains Connection: close, it does not intend to keep the connection,
                    // and the wishes of the response are ignored. This header is part of the HTTP/1.1 specification
                    if (HTTP.CONN_CLOSE.equalsIgnoreCase(token)) {
                        return false;
                    }
                }
            }
        }
        // Fall back to the reuse strategy of the parent class
        return super.keepAlive(response, context);
    }
}

Take a look at the parent class reuse strategy

    // Excerpt: headerIterator iterates over the Connection header of the response,
    // and ver is the protocol version of the response
    if (canResponseHaveBody(request, response)) {
        final Header[] clhs = response.getHeaders(HTTP.CONTENT_LEN);
        // A response whose Content-Length is not set correctly cannot have its connection reused
        if (clhs.length == 1) {
            final Header clh = clhs[0];
            try {
                final int contentLen = Integer.parseInt(clh.getValue());
                if (contentLen < 0) {
                    return false;
                }
            } catch (final NumberFormatException ex) {
                return false;
            }
        } else {
            return false;
        }
    }
    if (headerIterator.hasNext()) {
        try {
            final TokenIterator ti = new BasicTokenIterator(headerIterator);
            boolean keepalive = false;
            while (ti.hasNext()) {
                final String token = ti.nextToken();
                // If the response has a Connection: close header, it explicitly asks for the connection to be closed
                if (HTTP.CONN_CLOSE.equalsIgnoreCase(token)) {
                    return false;
                // If the response has a Connection: keep-alive header, it explicitly asks for a persistent connection
                } else if (HTTP.CONN_KEEP_ALIVE.equalsIgnoreCase(token)) {
                    keepalive = true;
                }
            }
            if (keepalive) {
                return true;
            }
        } catch (final ParseException px) {
            return false;
        }
    }
    // If the response has no relevant Connection header, every version higher than HTTP/1.0 reuses the connection
    return !ver.lessEquals(HttpVersion.HTTP_1_0);

To sum up:

  • If the request header contains Connection: close, the connection is not reused
  • If the Content-Length of the response is not set correctly, the connection is not reused
  • If the response header contains Connection: close, the connection is not reused
  • If the response header contains Connection: keep-alive, the connection is reused
  • Otherwise, the connection is reused if the HTTP version is higher than 1.0
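
The order of these checks matters, so it may help to see them as one small function. This is a simplified sketch rather than the library code: ReuseDecision and canReuse are hypothetical names, header values are passed in as plain strings, and the sketch assumes a response that can carry a body and is delimited by Content-Length.

```java
class ReuseDecision {

    /**
     * @param requestConnection  value of the request's Connection header, or null
     * @param contentLength      value of the response's Content-Length header, or null
     * @param responseConnection value of the response's Connection header, or null
     * @param minorVersion       minor version of the HTTP/1.x response (0 or 1)
     */
    static boolean canReuse(String requestConnection, String contentLength,
                            String responseConnection, int minorVersion) {
        // 1. The request asked for the connection to be closed.
        if (hasToken(requestConnection, "close")) {
            return false;
        }
        // 2. The end of the body must be known from a valid Content-Length.
        try {
            if (contentLength == null || Integer.parseInt(contentLength.trim()) < 0) {
                return false;
            }
        } catch (NumberFormatException ex) {
            return false;
        }
        // 3. An explicit "close" in the response wins over "keep-alive".
        if (hasToken(responseConnection, "close")) {
            return false;
        }
        // 4. An explicit "keep-alive" in the response enables reuse.
        if (hasToken(responseConnection, "keep-alive")) {
            return true;
        }
        // 5. Default: persistent for versions higher than HTTP/1.0.
        return minorVersion > 0;
    }

    /** Checks a comma-separated header value for a token, ignoring case. */
    private static boolean hasToken(String headerValue, String token) {
        if (headerValue == null) {
            return false;
        }
        for (String part : headerValue.split(",")) {
            if (part.trim().equalsIgnoreCase(token)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(canReuse(null, "42", null, 1));         // HTTP/1.1 default
        System.out.println(canReuse(null, "42", "keep-alive", 0)); // HTTP/1.0 with keep-alive
        System.out.println(canReuse("close", "42", "keep-alive", 1));
    }
}
```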

As you can see from the code, the implementation strategy is consistent with the constraints of the protocol layer in Chapters 2 and 3.

How does HttpClient clean up stale connections

Prior to HttpClient 4.4, a connection taken from the pool for reuse was checked for expiration at that moment and cleaned up if it had expired.

In later versions, a separate thread can scan the connections in the pool and close those whose time since last use exceeds the configured limit. Separately, a pooled connection that has been idle for more than 2 seconds by default (the validateAfterInactivity setting of PoolingHttpClientConnectionManager) is validated again when it is leased.

    public CloseableHttpClient build() {
        // The cleanup thread is started only if cleaning of expired or idle connections was requested
        if (evictExpiredConnections || evictIdleConnections) {
            // Create the connection pool cleanup thread
            final IdleConnectionEvictor connectionEvictor = new IdleConnectionEvictor(cm,
                    maxIdleTime > 0 ? maxIdleTime : 10, maxIdleTimeUnit != null ? maxIdleTimeUnit : TimeUnit.SECONDS,
                    maxIdleTime, maxIdleTimeUnit);
            closeablesCopy.add(new Closeable() {

                @Override
                public void close() throws IOException {
                    connectionEvictor.shutdown();
                    try {
                        connectionEvictor.awaitTermination(1L, TimeUnit.SECONDS);
                    } catch (final InterruptedException interrupted) {
                        Thread.currentThread().interrupt();
                    }
                }

            });
            // Start the cleanup thread
            connectionEvictor.start();
        }
        // (the rest of build() is omitted)
    }

You can see that while HttpClientBuilder builds the client, a connection pool cleanup thread is created and started if cleanup has been enabled.

    public IdleConnectionEvictor(
            final HttpClientConnectionManager connectionManager,
            final ThreadFactory threadFactory,
            final long sleepTime, final TimeUnit sleepTimeUnit,
            final long maxIdleTime, final TimeUnit maxIdleTimeUnit) {
        this.connectionManager = Args.notNull(connectionManager, "Connection manager");
        this.threadFactory = threadFactory != null ? threadFactory : new DefaultThreadFactory();
        this.sleepTimeMs = sleepTimeUnit != null ? sleepTimeUnit.toMillis(sleepTime) : sleepTime;
        this.maxIdleTimeMs = maxIdleTimeUnit != null ? maxIdleTimeUnit.toMillis(maxIdleTime) : maxIdleTime;
        this.thread = this.threadFactory.newThread(new Runnable() {
            @Override
            public void run() {
                try {
                    // Endless loop: the thread keeps running until it is interrupted
                    while (!Thread.currentThread().isInterrupted()) {
                        // Sleep between runs; the default is 10 seconds
                        Thread.sleep(sleepTimeMs);
                        // Close expired connections
                        connectionManager.closeExpiredConnections();
                        // If a maximum idle time is specified, close idle connections as well
                        if (maxIdleTimeMs > 0) {
                            connectionManager.closeIdleConnections(maxIdleTimeMs, TimeUnit.MILLISECONDS);
                        }
                    }
                } catch (final Exception ex) {
                    exception = ex;
                }
            }
        });
    }

To sum up:

  • Cleanup of expired and idle connections is enabled only if it is explicitly configured on HttpClientBuilder
  • Once enabled, a thread runs in a loop: each iteration sleeps for a while and then calls the HttpClientConnectionManager cleanup methods to close expired and idle connections.
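
For reference, enabling this cleanup on the builder might look like the configuration sketch below. It assumes HttpClient 4.5.x on the classpath; the class name EvictingClientConfig and the concrete limits (200, 20, 30 seconds) are illustrative values chosen for this example, not recommendations from the source.

```java
import java.util.concurrent.TimeUnit;

import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;

class EvictingClientConfig {

    static CloseableHttpClient newClient() {
        PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
        cm.setMaxTotal(200);          // illustrative limit for the whole pool
        cm.setDefaultMaxPerRoute(20); // illustrative limit for each route

        return HttpClients.custom()
                .setConnectionManager(cm)
                // Turns on the evictExpiredConnections flag checked in build()
                .evictExpiredConnections()
                // Turns on evictIdleConnections and sets maxIdleTime
                .evictIdleConnections(30L, TimeUnit.SECONDS)
                .build();
    }
}
```

Closing the returned client also shuts the cleanup thread down, through the Closeable registered in build().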

VII. Summary of this article

  • The HTTP protocol uses persistent connections to relieve the cost of the excessive connection setup in its early design
  • There are two kinds of persistent connections: HTTP/1.0+ keep-alive and the default persistent connections of HTTP/1.1
  • HttpClient manages persistent connections through a connection pool, which consists of two levels: the total connection pool and a connection pool for each route
  • HttpClient gets a pooled connection via an asynchronous Future
  • The default connection reuse strategy is consistent with the constraints of the HTTP protocol. Based on the response, Connection: close disables reuse, Connection: keep-alive enables it, and otherwise the connection is reused if the HTTP version is higher than 1.0
  • Connections in the pool are cleaned up only if cleanup of expired and idle connections has been explicitly enabled on HttpClientBuilder
  • HttpClient 4.4 and later clean up expired and idle connections with a thread that runs in an endless loop and sleeps between iterations to achieve periodic execution

The above is based on my personal understanding of the HttpClient source code. If anything is wrong, please leave a comment so that we can discuss it.


Source: www.liangsonghua.me
