Introduction to this Article

[1] The epidemic is ongoing

[2] Application exception monitoring

[3] Redis client exception analysis

[4] Redis client problem guide analysis

[5] Analysis from Redis client perspective

[6] From the perspective of Redis server

[7] Analysis on rationality of resource pool production allocation

[8] Summary of this paper

[1] The epidemic is ongoing

In response to the national call to fight the epidemic, more than 100 million employees across the country have switched to remote working, and the major IT companies have also opened VPN access to keep their businesses running.

So what can we do? Helpless, long-suffering programmers can only cheer themselves on and keep working! Here’s a picture of what bosses are worried about.

Sorry, but the biggest concern for bosses is not their employees’ physical and mental health, but their productivity.

However, according to the estimates of industry insiders, COVID-19 is likely to stimulate the acceleration of information construction of domestic enterprises.

For enterprises, it is worth expecting that “telecommuting” allows enterprises to see more possibilities of office forms, and helps enterprises to try new office forms in the future.

More importantly, enterprises can take this as an opportunity to improve their own information construction, enhance team cohesion and coordination, and run smoothly in the “crisis”, or even find opportunities.

The author is no exception. This week I started working remotely as the company required. My impression so far is that my efficiency has definitely been affected, but because we had clear goals for the first quarter, we were able to stay on schedule every day.

In addition, because of the epidemic, everyone has been staying at home. The DAU of one of our company's services has been rising recently, and income from paid members has also increased a lot, which is good news for us.

Working from home and not adding to the country's troubles is good enough 🤩!

Next, let’s talk about a problem with the online environment and the analysis process.

[2] Application exception monitoring

Sure enough, during the epidemic an exception from the Redis client in our project showed up. Although it occurs only occasionally, its cause still deserves careful analysis.

The specific exception information is as follows:

When you look at the screenshot showing the exception information, are you wondering why this exception display is so “friendly”?

Yes, it comes from Sentry, a very useful real-time exception monitoring tool that we have been using in our projects for a long time.

For example, when an exception occurs, Sentry collects the full request URL, the information reported by the client, the device model, and other details as TAGS and displays them, so that you can quickly locate the problem by combining this information.

This service is deployed in a K8S container environment. In the TAGS of the screenshot, server_name is the Pod hostname, so you can quickly see which Pod has the problem and go straight into that Pod on the container platform for detailed analysis.

It is highly recommended that you include Sentry in your project. It not only has a very good exception management platform; more importantly, it provides clients for many languages, such as Java, Android, C++, Python, and Go. The ready-made clients are easy to integrate and use.

As long as your service is running, there will always be some ERROR-level logs in the project's output. Sentry then sends you an alert (email…) to let you know.

[3] Redis client exception analysis

The Jedis client (the Java client for Redis) used in this project reported the exception JedisConnectionException: Unexpected end of stream. I have seldom run into this problem when using Redis; now that I have, it must be fate 🙂

In fact, the exception stack already shows the detailed call path. To find where the problem is, follow the stack for clues.

How do I find a more detailed stack? Don’t worry, clicking on RAW in the image above will bring up the entire exception stack text, which is easy to copy and analyze.

As follows:

redis.clients.jedis.exceptions.JedisConnectionException: Unexpected end of stream.
    at redis.clients.util.RedisInputStream.ensureFill(RedisInputStream.java:199)
    at redis.clients.util.RedisInputStream.readByte(RedisInputStream.java:40)
    at redis.clients.jedis.Protocol.process(Protocol.java:151)
    at redis.clients.jedis.Protocol.read(Protocol.java:215)
    at redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:340)
    at redis.clients.jedis.Connection.getStatusCodeReply(Connection.java:239)
    at redis.clients.jedis.BinaryJedis.auth(BinaryJedis.java:2139)
    at redis.clients.jedis.JedisFactory.makeObject(JedisFactory.java:108)
    at org.apache.commons.pool2.impl.GenericObjectPool.create(GenericObjectPool.java:888)
    at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:432)
    at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:361)
...

According to the information above, the error occurred when BinaryJedis.auth was called to verify the Redis password. GenericObjectPool.borrowObject is the method that borrows an object; GenericObjectPool is the object pool from the open-source Apache Commons Pool2 project, which is used in many open-source projects.

In other words, the pool had no idle object to lend, so GenericObjectPool.create called JedisFactory.makeObject to create a new one.

Oh? It is possible that the network was unstable and the redis-server did not return a normal response when verifying the password while the new object was being created.

[4] Redis client problem guide analysis

We found a resource pool in the exception stack above. What would happen if these objects were not managed by a resource pool?

As shown below, each time a Redis connection is needed, the client creates a new Jedis object, which then connects to the Redis Server. This establishes a TCP connection (three-way handshake), and afterwards the TCP connection is torn down (four-way handshake). This consumes resources and fails to meet application performance requirements.

With a resource pool, it looks like the following figure:

A certain number of objects are initialized in the resource pool as required. When a client request arrives, the system obtains an object from the resource pool. After the object has been used, the system returns it to the resource pool for other clients to use.

This is called a “pooling technique” and will be used in your projects, such as database connection pooling, application server thread pooling, and so on.

Pooling has the advantage of reusing the objects in the pool, as shown above. It avoids the overhead of allocating memory and creating objects on the heap; it avoids the cost of repeatedly establishing and tearing down TCP connections; it reduces the burden on the garbage collector by avoiding the overhead of freeing memory and destroying objects; and it avoids memory churn and repeated initialization of object state.

Of course, we could implement this ourselves, but writing a reasonably complete object pool takes a lot of effort, and there are many details to consider.

Standing on the shoulders of giants, Jedis implements its pool internally with the open-source Apache Commons Pool2 toolkit, which is widely used in many open-source projects.

Many of the parameters on the Jedis client come from the underlying Apache Commons Pool2 implementation.

This is also why Jedis and some other Redis clients are easy to use. At the same time, we need to configure the connection pool parameters properly for each scenario; unreasonable configuration or improper use can cause many problems.

Returning to the original exception, what could it be related to?

From the figure, we can see that the client uses a connection pool, so the problem may be related to the pool. The failure occurred at auth, the password check that is sent after connect has already succeeded. The connection to the Redis Server had therefore been established, so the Redis Server cannot be ruled out either.

If it is related to the Redis client, we need to further analyze whether the parameters of the client and the connection pool are reasonable.

If it is related to Redis Server, it is necessary to analyze the parameters of the Server based on the problem and determine whether the relevant configuration parameters are reasonable.

[5] Analysis from Redis client perspective

Now that we are talking about the Redis client, the first thing that comes to mind is the parameters of the client configuration.

Rather than going straight to the parameters, we can start from the exception stack and the object resource pool, and see how the pool manages its objects.

1. Manage resource pool objects

The procedure for creating an object in a resource pool is shown in the figure above.

Apache Commons Pool2 is a general-purpose resource pool management framework. It defines the interfaces and specifications of the resource pool internally; the actual object creation is implemented by the framework that uses it.

1) To get an object from the resource pool, ObjectPool#borrowObject is called. If there is no idle object, PooledObjectFactory#makeObject is called to create one. JedisFactory is the concrete implementation class.

2) The created object is added to the resource pool and returned to the client.

3) After the object has been used, ObjectPool#returnObject is called. It checks internally that certain conditions are met and, if so, returns the object to the resource pool.

4) If a condition fails, for example the resource pool is closed, the object is in an incorrect state (the Jedis connection is broken), or the maximum number of idle resources has been exceeded, PooledObjectFactory#destroyObject is called to destroy the object.
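The four steps above can be sketched in plain Java. This is a minimal illustration, not the Commons Pool2 implementation: the class TinyPool and its fields are hypothetical, the Supplier and Consumer stand in for PooledObjectFactory#makeObject and #destroyObject, and an exhausted pool simply throws instead of blocking the caller.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical, simplified pool that follows the borrow/create/return/destroy steps. */
public class TinyPool<T> {
    private final Supplier<T> makeObject;    // stands in for PooledObjectFactory#makeObject
    private final Consumer<T> destroyObject; // stands in for PooledObjectFactory#destroyObject
    private final int maxTotal;
    private final int maxIdle;
    private final Deque<T> idleObjects = new ArrayDeque<>();
    private int total;          // idle objects plus borrowed objects
    private int createdCount;
    private int destroyedCount;

    public TinyPool(Supplier<T> makeObject, Consumer<T> destroyObject, int maxTotal, int maxIdle) {
        this.makeObject = makeObject;
        this.destroyObject = destroyObject;
        this.maxTotal = maxTotal;
        this.maxIdle = maxIdle;
    }

    /** Steps 1 and 2: reuse an idle object if there is one, otherwise create a new one. */
    public synchronized T borrowObject() {
        T idle = idleObjects.pollFirst();
        if (idle != null) {
            return idle;
        }
        if (total >= maxTotal) {
            // Simplification: the real pool can block the caller for up to maxWaitMillis
            throw new IllegalStateException("Pool exhausted");
        }
        T created = makeObject.get();
        total++;
        createdCount++;
        return created;
    }

    /** Steps 3 and 4: keep the object as idle, or destroy it when maxIdle is already reached. */
    public synchronized void returnObject(T obj) {
        if (idleObjects.size() >= maxIdle) {
            destroyObject.accept(obj);
            total--;
            destroyedCount++;
        } else {
            idleObjects.addFirst(obj); // head of the idle list, as with lifo = true
        }
    }

    public synchronized int getCreatedCount() { return createdCount; }
    public synchronized int getDestroyedCount() { return destroyedCount; }
    public synchronized int getNumIdle() { return idleObjects.size(); }
}
```

With maxTotal = 2 and maxIdle = 1, borrowing two objects and returning both keeps one idle and destroys the other; the next borrow reuses the idle object without creating a new one.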

ObjectPool and KeyedObjectPool are the two base interfaces. The default implementation of ObjectPool is GenericObjectPool. The KeyedObjectPool interface maintains objects as key-value pairs; its default implementation class is GenericKeyedObjectPool. Functionality that is common to both implementations lives in the BaseGenericObjectPool base class.

SoftReferenceObjectPool is a special implementation in which each object is wrapped in a SoftReference. A SoftReference allows the garbage collector to reclaim objects in the pool when memory runs short during a JVM GC, which avoids memory leaks.

PooledObject is the interface for pooled objects; it wraps the object being pooled. DefaultPooledObject is the default implementation of the PooledObject interface. PooledSoftReference wraps an object in a SoftReference for use by SoftReferenceObjectPool.

2. Object pool parameters

One way to view the object pool parameters is to look them up directly in the code or in the documentation on the official website. Another way is more intuitive: because Commons Pool2 is integrated with JMX, you can use tools such as JConsole to view the exposed attributes and operations.

The first way:

Find the corresponding configuration class:

The setter methods provided by the GenericObjectPoolConfig and BaseObjectPoolConfig configuration classes are configuration parameters, and are commented in detail in the code.

The second way:

The premise is that your application exposes the JMX port and IP to allow external connections.

JVM parameters are as follows:

-Dcom.sun.management.jmxremote
-Djava.rmi.server.hostname=<IP address>
-Dcom.sun.management.jmxremote.port=<port>
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.ssl=false

In the MBeans view, the pool can be found under org.apache.commons.pool2 > GenericObjectPool > pool2. Besides the configuration, it also exposes some statistical properties of the resource pool.

Core configuration attributes:

These are the focused properties and configurable parameters that are provided externally.

1) minIdle Minimum number of idle connections that the resource pool keeps. Default value: 0

2) maxIdle Maximum number of idle connections in the resource pool. Default value: 8

3) maxTotal Maximum number of connections in the resource pool. Default value: 8

4) maxWaitMillis Maximum time, in milliseconds, that the caller waits when the connections in the resource pool are exhausted. Default value: -1, which means wait indefinitely. Setting a proper value is recommended

5) testOnBorrow Whether to check the validity of a connection when it is borrowed from the resource pool. Invalid connections are removed. Default value: false

6) testOnCreate Whether to check the validity of a connection right after it is created. Invalid connections are removed. Default value: false

7) testOnReturn Whether to check the validity of a connection when it is returned to the resource pool. Invalid connections are removed. Default value: false

8) testWhileIdle Whether to enable idle resource monitoring. Default value: false

9) blockWhenExhausted Whether the caller should wait when the resource pool is exhausted. Default value: true. The maxWaitMillis parameter takes effect only when this is true. Using the default value is recommended

10) lifo How objects are placed in the idle queue. true (default) means last in, first out: returned objects are placed at the head of the idle queue. false means they are placed at the tail of the idle queue

Idle resource monitoring configuration properties:

When the testWhileIdle parameter is enabled, it works together with the following parameters to monitor idle resources.

1) timeBetweenEvictionRunsMillis Interval between idle resource detection runs, in milliseconds. Default value: -1, which means no detection. Setting a reasonable value is recommended so that the monitoring task runs periodically

2) minEvictableIdleTimeMillis Minimum idle time of a resource in the pool, in milliseconds. Default value: 30 minutes (1000L * 60L * 30L). An idle resource is removed once it has been idle for longer than this value. Set it according to your own business needs

3) numTestsPerEvictionRun Number of resources sampled in each idle resource detection run. Default value: 3, which can be fine-tuned based on the number of connections. If set to -1, all idle connections are checked
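How numTestsPerEvictionRun turns into a concrete number of checks per run can be sketched as follows. The helper is modelled on my reading of the private getNumTests() method in GenericObjectPool; the class and method names are hypothetical, and the exact behaviour should be confirmed against the Commons Pool2 version you use.

```java
/** Hypothetical helper showing how many idle objects one eviction run examines. */
public class NumTestsSketch {

    /**
     * @param numTestsPerEvictionRun the configured value (may be negative)
     * @param idleCount              current size of the idle list
     */
    public static int numTests(int numTestsPerEvictionRun, int idleCount) {
        if (numTestsPerEvictionRun >= 0) {
            // Never test more objects than are currently idle
            return Math.min(numTestsPerEvictionRun, idleCount);
        }
        // A negative value -n means "test roughly 1/n of the idle objects", rounded up
        return (int) Math.ceil(idleCount / Math.abs((double) numTestsPerEvictionRun));
    }
}
```

Under this reading, the default of 3 checks at most three idle objects per run, while -1 checks every idle object, which is the value JedisPoolConfig chooses.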

3. Idle resource monitoring source code analysis

After the resource pool is initialized, an idle resource monitoring task is started. The flow is as follows:

Corresponding source code:

The configuration and the task are initialized in the constructor when the resource pool object is created.

this.internalPool = new GenericObjectPool<T>(factory, poolConfig);

public GenericObjectPool(final PooledObjectFactory<T> factory,
        final GenericObjectPoolConfig config) {

    super(config, ONAME_BASE, config.getJmxNamePrefix());

    if (factory == null) {
        jmxUnregister(); // tidy up
        throw new IllegalArgumentException("factory may not be null");
    }
    this.factory = factory;

    idleObjects = new LinkedBlockingDeque<PooledObject<T>>(config.getFairness());

    // Initialize the configuration
    setConfig(config);

    // Start the resource monitoring task
    startEvictor(getTimeBetweenEvictionRunsMillis());
}

final void startEvictor(final long delay) {
    synchronized (evictionLock) {
        if (null != evictor) {
            EvictionTimer.cancel(evictor, evictorShutdownTimeoutMillis, TimeUnit.MILLISECONDS);
            evictor = null;
            evictionIterator = null;
        }
        if (delay > 0) {
            evictor = new Evictor();
            // Enable the scheduled task
            EvictionTimer.schedule(evictor, delay, delay);
        }
    }
}

Evictor is a TimerTask. Once the scheduler is enabled, it runs every timeBetweenEvictionRunsMillis.

class Evictor extends TimerTask {
    @Override
    public void run() {
        final ClassLoader savedClassLoader =
                Thread.currentThread().getContextClassLoader();
        try {
            ...
            // Evict from the pool
            evict();
            // Ensure min idle num
            ensureMinIdle();
        } finally {
            // Restore the previous CCL
            Thread.currentThread().setContextClassLoader(savedClassLoader);
        }
    }
}

The evict() removal method:

@Override
public void evict() throws Exception {
    assertOpen();

    if (idleObjects.size() > 0) {
        PooledObject<T> underTest = null;
        // Eviction policy
        final EvictionPolicy<T> evictionPolicy = getEvictionPolicy();

        synchronized (evictionLock) {
            final EvictionConfig evictionConfig = new EvictionConfig(
                    getMinEvictableIdleTimeMillis(),
                    getSoftMinEvictableIdleTimeMillis(),
                    getMinIdle());

            final boolean testWhileIdle = getTestWhileIdle();

            for (int i = 0, m = getNumTests(); i < m; i++) {
                // ...
                // underTest represents each resource
                boolean evict;
                evict = evictionPolicy.evict(evictionConfig, underTest, idleObjects.size());

                if (evict) {
                    // The eviction condition is met: destroy the resource
                    destroy(underTest);
                    destroyedByEvictorCount.incrementAndGet();
                } else {
                    if (testWhileIdle) {
                        boolean active = false;
                        try {
                            factory.activateObject(underTest);
                            active = true;
                        } catch (final Exception e) {
                            destroy(underTest);
                            destroyedByEvictorCount.incrementAndGet();
                        }
                        if (active) {
                            if (!factory.validateObject(underTest)) {
                                destroy(underTest);
                                destroyedByEvictorCount.incrementAndGet();
                            } else {
                                try {
                                    factory.passivateObject(underTest);
                                } catch (final Exception e) {
                                    destroy(underTest);
                                    destroyedByEvictorCount.incrementAndGet();
                                }
                            }
                        }
                    }
                    // ...
                }
            }
        }
    }
    // ...
}

The default evictionPolicy used in this code is provided by org.apache.commons.pool2.impl.DefaultEvictionPolicy.

// DefaultEvictionPolicy#evict()
@Override
public boolean evict(final EvictionConfig config, final PooledObject<T> underTest,
        final int idleCount) {

    if ((config.getIdleSoftEvictTime() < underTest.getIdleTimeMillis() &&
            config.getMinIdle() < idleCount) ||
            config.getIdleEvictTime() < underTest.getIdleTimeMillis()) {
        return true;
    }
    return false;
}

1) It returns true when the size of the idle resource list exceeds minIdle (the minimum number of idle resources) and the resource has been idle for longer than idleSoftEvictTime.

When the EvictionConfig is initialized, idleSoftEvictTime is set to Long.MAX_VALUE if the configured value is not positive, which is the case for the default of -1.

2) It returns true when the resource has been idle for longer than the minimum idle time configured for the resource pool (idleEvictTime). This means the resource is idle and has not been used during that period.

If either condition is met, the resource object is destroyed.
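To make the two conditions concrete, here is the same decision written as a standalone function with a few worked cases. The class and method names are hypothetical; effective() models the Long.MAX_VALUE substitution described above for non-positive configured values.

```java
/** Hypothetical restatement of the DefaultEvictionPolicy decision, for experimenting with values. */
public class EvictionRuleSketch {

    /** Non-positive configured times mean "disabled" and are treated as Long.MAX_VALUE. */
    public static long effective(long configuredMillis) {
        return configuredMillis > 0 ? configuredMillis : Long.MAX_VALUE;
    }

    public static boolean shouldEvict(long minEvictableIdleTimeMillis,
                                      long softMinEvictableIdleTimeMillis,
                                      int minIdle,
                                      long idleTimeMillis,
                                      int idleCount) {
        long idleEvictTime = effective(minEvictableIdleTimeMillis);
        long idleSoftEvictTime = effective(softMinEvictableIdleTimeMillis);
        // Condition 1: soft limit exceeded and more than minIdle objects are idle
        boolean soft = idleSoftEvictTime < idleTimeMillis && minIdle < idleCount;
        // Condition 2: hard limit exceeded, regardless of minIdle
        boolean hard = idleEvictTime < idleTimeMillis;
        return soft || hard;
    }
}
```

For example, with minEvictableIdleTimeMillis = 60000 and the soft limit left at -1, an object idle for 61 seconds is evicted even if only three objects are idle and minIdle is 5. With the hard limit disabled, a soft limit of 180000 and minIdle = 5, an object idle for 200 seconds is evicted only while more than five objects are idle.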

The ensureIdle() method (ensureMinIdle() calls it with the configured minIdle):

private void ensureIdle(final int idleCount, final boolean always) throws Exception {
    if (idleCount < 1 || isClosed() || (!always && !idleObjects.hasTakeWaiters())) {
        return;
    }

    while (idleObjects.size() < idleCount) {
        final PooledObject<T> p = create();
        if (p == null) {
            // Can't create objects, no reason to think another call to
            // create will work. Give up.
            break;
        }
        if (getLifo()) {
            idleObjects.addFirst(p);
        } else {
            idleObjects.addLast(p);
        }
    }
    if (isClosed()) {
        // Pool closed while object was being added to idle objects.
        // Make sure the returned object is destroyed rather than left
        // in the idle object pool (which would effectively be a leak)
        clear();
    }
}

That concludes the analysis of the basic principles and parameters of the resource pool.

4. Pooled object states

The state of a pooled object is defined in PooledObjectState, an enumerated type with the following values:

IDLE: the object is idle

ALLOCATED: the object is in use

EVICTION: the object is being tested by the Evictor for possible eviction

VALIDATION: the object is being validated

INVALID: the eviction test or validation failed and the object will be destroyed

ABANDONED: the object was borrowed and has not been returned for a long time, so it is considered abandoned

RETURNING: the object is being returned to the object pool

The state transitions of a pooled object are shown in the following diagram:
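As a rough textual companion to the state diagram, the transitions that this article walks through (borrow, return, eviction test, abandonment) can be written as a small table. This is a simplified sketch under my own assumptions, not the library's state machine: the class PooledStateSketch is hypothetical, and VALIDATION and the library's other intermediate states are left out.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified transition table for the pooled object states discussed here. */
public class PooledStateSketch {

    public enum State { IDLE, ALLOCATED, EVICTION, INVALID, ABANDONED, RETURNING }

    // Assumed transitions, for illustration only
    private static final Map<State, Set<State>> NEXT = new EnumMap<>(State.class);
    static {
        NEXT.put(State.IDLE, EnumSet.of(State.ALLOCATED, State.EVICTION));      // borrowed, or picked by the Evictor
        NEXT.put(State.ALLOCATED, EnumSet.of(State.RETURNING, State.ABANDONED, State.INVALID));
        NEXT.put(State.RETURNING, EnumSet.of(State.IDLE, State.INVALID));       // kept as idle, or destroyed
        NEXT.put(State.EVICTION, EnumSet.of(State.IDLE, State.INVALID));        // passed the test, or evicted
        NEXT.put(State.ABANDONED, EnumSet.of(State.INVALID));
        NEXT.put(State.INVALID, EnumSet.noneOf(State.class));                   // terminal
    }

    public static boolean canMove(State from, State to) {
        return NEXT.get(from).contains(to);
    }
}
```

The normal life of a connection is the cycle IDLE, ALLOCATED, RETURNING, IDLE; every other path ends in INVALID, after which the object is destroyed.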

5. Object pool initialization time

Consider a question: when are the objects in the resource pool initialized? The resource pool here means idleObjects, the cached list of idle resource objects. Is it when an object is created or when it is returned?

The answer is when the object is returned.

In some scenarios, timeouts may occur right after startup because each request has to create a new resource, which has some overhead.

After the application starts, we can warm up the connection pool in advance. Example code is as follows:

List<Jedis> minIdleList = new ArrayList<Jedis>(jedisPoolConfig.getMinIdle());

// Borrow minIdle connections first, which forces the pool to create them
for (int i = 0; i < jedisPoolConfig.getMinIdle(); i++) {
    try {
        Jedis jedis = pool.getResource();
        minIdleList.add(jedis);
        jedis.ping();
    } catch (Exception e) {
        logger.error(e.getMessage(), e);
    }
}

// Then return every connection that was actually borrowed, so it ends up in the idle list
for (Jedis jedis : minIdleList) {
    try {
        jedis.close();
    } catch (Exception e) {
        logger.error(e.getMessage(), e);
    }
}

Note that for a Jedis instance obtained from a pool, jedis.close() returns the resource to the pool.

Take a look at the source code that returns an object to the resource pool.

GenericObjectPool#returnObject() returns an object:

public void returnObject(final T obj) {
    // allObjects is where all object resources are stored
    final PooledObject<T> p = allObjects.get(new IdentityWrapper<T>(obj));
    // ...
    synchronized (p) {
        final PooledObjectState state = p.getState();
        if (state != PooledObjectState.ALLOCATED) {
            throw new IllegalStateException(
                    "Object has already been returned to this pool or is invalid");
        }
        p.markReturning(); // Keep from being marked abandoned
    }

    final long activeTime = p.getActiveTimeMillis();

    if (getTestOnReturn()) {
        if (!factory.validateObject(p)) {
            try {
                destroy(p);
            } catch (final Exception e) {
                swallowException(e);
            }
            try {
                ensureIdle(1, false);
            } catch (final Exception e) {
                swallowException(e);
            }
            updateStatsReturn(activeTime);
            return;
        }
    }
    // ...
    if (!p.deallocate()) {
        throw new IllegalStateException(
                "Object has already been returned to this pool or is invalid");
    }

    final int maxIdleSave = getMaxIdle();
    if (isClosed() || maxIdleSave > -1 && maxIdleSave <= idleObjects.size()) {
        try {
            destroy(p);
        } catch (final Exception e) {
            swallowException(e);
        }
    } else {
        if (getLifo()) {
            idleObjects.addFirst(p);
        } else {
            idleObjects.addLast(p);
        }
        if (isClosed()) {
            // Pool closed while object was being added to idle objects.
            // Make sure the returned object is destroyed rather than left
            // in the idle object pool (which would effectively be a leak)
            clear();
        }
    }
    updateStatsReturn(activeTime);
}

When an object is returned, its state first changes from ALLOCATED to RETURNING. If testOnReturn is true, the validity of the resource (the Jedis connection) is checked; if it is invalid, destroy() is called. If the size of the idleObjects list has not yet reached maxIdle, the returned object is added to idleObjects; otherwise it is destroyed.

The borrowObject() method takes an object from the idle list with idleObjects#pollFirst() and creates one if none is available. The total number of objects cannot exceed maxTotal.

6. Jedis client connection pool parameters

Now that we’ve looked at the pooling principles of the Apache Commons Pool2 framework, let’s look at how Jedis wraps it.

The parameters of the connection pool are built from JedisPoolConfig.

The default constructor of JedisPoolConfig, the Jedis resource pool configuration class:

public class JedisPoolConfig extends GenericObjectPoolConfig {
    public JedisPoolConfig() {
        // defaults to make your life with connection pool easier :)
        setTestWhileIdle(true);
        setMinEvictableIdleTimeMillis(60000);
        setTimeBetweenEvictionRunsMillis(30000);
        setNumTestsPerEvictionRun(-1);
    }
}

JedisPoolConfig inherits from GenericObjectPoolConfig. Its default constructor sets testWhileIdle to true (the default is false), minEvictableIdleTimeMillis to 60 seconds (the default is 30 minutes), timeBetweenEvictionRunsMillis to 30 seconds (the default is -1), and numTestsPerEvictionRun to -1 (the default is 3).

Idle resources are checked every 30 seconds. If an idle resource has not been used for more than 60 seconds, it is removed from the resource pool.

After creating the JedisPoolConfig object, set some parameters:

// Create the JedisPoolConfig object and set parameters
JedisPoolConfig jedisPoolConfig = new JedisPoolConfig();
jedisPoolConfig.setMaxTotal(100);
jedisPoolConfig.setMaxIdle(60);
jedisPoolConfig.setMaxWaitMillis(1000);
jedisPoolConfig.setTestOnBorrow(false);
jedisPoolConfig.setTestOnReturn(true);

JedisPool manages the Jedis connection pool:

public JedisPool(final GenericObjectPoolConfig poolConfig, final String host, int port,
        int timeout, final String password) {
    this(poolConfig, host, port, timeout, password, Protocol.DEFAULT_DATABASE, null);
}

public abstract class Pool<T> implements Closeable {
    protected GenericObjectPool<T> internalPool;

    public Pool(final GenericObjectPoolConfig poolConfig, PooledObjectFactory<T> factory) {
        initPool(poolConfig, factory);
    }
}

[6] From the perspective of Redis server

Since we suspect the problem may be related to the Redis server, we need to analyze whether the server's parameter configuration could have an impact.

1. Redis client buffer is full

Redis has three types of client buffers:

Normal client buffer (normal):

Used for ordinary commands, such as get, set, mset, and hgetall.

Slave client buffer (slave):

Used to synchronize write commands from the master node in order to complete replication.

Publish/subscribe buffer (pubsub):

Pubsub is not an ordinary command and therefore has a separate buffer.

The Redis client buffer configuration format is:

client-output-buffer-limit <class> <hard limit> <soft limit> <soft seconds>

(1) class: the client type: normal, slave, or pubsub

(2) hard limit: if the output buffer used by the client is larger than the hard limit, the client is closed immediately.

(3) soft limit and soft seconds: if the output buffer used by the client exceeds the soft limit and stays above it for soft seconds, the client is closed.

View the client-output-buffer-limit configuration of Redis:

127.0.0.1:6379> config get client-output-buffer-limit
1) "client-output-buffer-limit"
2) "normal 0 0 0 slave 21474836480 16106127360 60 pubsub 33554432 8388608 60"

The hard limit, soft limit, and soft seconds values of the normal client buffer are all 0, which means the buffer limit is disabled.

If the buffer limit is set too small, Unexpected end of stream exceptions may occur.
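The hard and soft limit rules can be checked with a small function. This sketch follows the description in this section rather than the Redis source, so the exact boundary comparisons are an assumption; the class and parameter names are hypothetical.

```java
/** Hypothetical check that follows the client-output-buffer-limit rules as described above. */
public class OutputBufferLimitSketch {

    /**
     * @param usedBytes             current output buffer size of the client
     * @param secondsAboveSoftLimit how long the buffer has stayed above the soft limit
     * @param hardLimit             hard limit in bytes, 0 disables it
     * @param softLimit             soft limit in bytes, 0 disables it
     * @param softSeconds           how long the soft limit may be exceeded
     */
    public static boolean shouldClose(long usedBytes, long secondsAboveSoftLimit,
                                      long hardLimit, long softLimit, long softSeconds) {
        if (hardLimit > 0 && usedBytes > hardLimit) {
            return true; // over the hard limit: close immediately
        }
        // Over the soft limit for long enough
        return softLimit > 0 && usedBytes > softLimit && secondsAboveSoftLimit >= softSeconds;
    }
}
```

With the pubsub values shown above (33554432, 8388608, 60), a 10 MB buffer is tolerated for up to 60 seconds, while a 40 MB buffer closes the client at once. With normal 0 0 0, the client is never closed because of its buffer.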

2. The timeout setting of the Redis server is incorrect

The Redis server disconnects the idle connection after the timeout period.

View the timeout configuration of the server:

127.0.0.1:6379> config get timeout
1) "timeout"
2) "600"

Timeout is set to 600 seconds: if a connection stays idle and unused for 10 minutes, Redis closes it.

The point is that this timeout interacts with the idle resource monitoring task of the Jedis connection pool described above.

Suppose timeBetweenEvictionRunsMillis is not set and the Commons Pool2 default of -1 applies; the Evictor idle monitoring task is then not started.

A Jedis connection borrowed from the pool may then have been idle for more than 10 minutes, in which case the Redis server has already closed it.

When the client still holds that Jedis connection and continues with operations such as set and get, the Unexpected end of stream exception occurs.
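What "end of stream" means at the socket level can be reproduced without Redis. The sketch below is JDK-only and entirely hypothetical (the class EndOfStreamSketch and its throwaway local server are not part of Jedis): the server reads one request and closes the connection without replying, so the client's next read returns -1. As far as I know, that end-of-stream condition is what Jedis reports as JedisConnectionException: Unexpected end of stream.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

/** Hypothetical demo: a peer that closes the connection makes the client's read return -1. */
public class EndOfStreamSketch {

    private static final byte[] REQUEST = "PING\r\n".getBytes(StandardCharsets.US_ASCII);

    /** Returns the value of the client's read() after the server has closed the connection. */
    public static int readAfterServerClose() {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread serverThread = new Thread(() -> {
                try (Socket peer = server.accept()) {
                    // Consume the request, then close without sending any reply
                    InputStream in = peer.getInputStream();
                    for (int i = 0; i < REQUEST.length; i++) {
                        if (in.read() == -1) {
                            break;
                        }
                    }
                } catch (IOException ignored) {
                    // Nothing to clean up in this sketch
                }
            });
            serverThread.start();

            try (Socket client = new Socket(InetAddress.getLoopbackAddress(), server.getLocalPort())) {
                client.setSoTimeout(5000);
                client.getOutputStream().write(REQUEST);
                client.getOutputStream().flush();
                serverThread.join(5000);          // wait until the server side has closed
                return client.getInputStream().read();
            }
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The client sees no error when it writes the request; the problem only surfaces on the read, which matches the stack traces in this article where the failure is raised while reading the reply.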

Example demonstration:

To facilitate the demonstration, the following parameters are adjusted.

1) The Redis server timeout is set to 10 seconds

2) The Java test code is shown below

new Thread(new Runnable() {
    public void run() {
        for (int i = 0; i < 5; i++) {
            System.out.println(" jedis.get(\"foo\"): " + jedis.get("foo"));
            try {
                Thread.sleep(12000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }
}).start();

Output result (the exception appears on a later jedis.get("foo") call):

Exception in thread "Thread-58" redis.clients.jedis.exceptions.JedisConnectionException: Unexpected end of stream.
    at redis.clients.util.RedisInputStream.ensureFill(RedisInputStream.java:199)
    at redis.clients.util.RedisInputStream.readByte(RedisInputStream.java:40)
    at redis.clients.jedis.Protocol.process(Protocol.java:151)
    at redis.clients.jedis.Protocol.read(Protocol.java:215)
    at redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:340)
    at redis.clients.jedis.Connection.getBinaryBulkReply(Connection.java:259)
    at redis.clients.jedis.Connection.getBulkReply(Connection.java:248)
    at redis.clients.jedis.Jedis.get(Jedis.java:153)

This is why the default JedisPoolConfig constructor starts the Evictor task directly: the client connection pool itself monitors idle connections and removes those that have been idle for longer than minEvictableIdleTimeMillis.

This prevents the client from obtaining a connection that can no longer be used normally, which would cause exceptions.

Whether the timeout value on the Redis server is reasonable depends on your own business scenario.

It is said that on Aliyun Redis (which our company does not use) timeout is set to 0, that is, idle connections are never actively closed, and the buffer limit is set to 0 0 0, meaning the client buffer is not limited. These settings are generally fine.
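The relationship between the server timeout and the client-side eviction settings can be expressed as a simple check. This is a rough sketch under stated assumptions, not a rule from Jedis or Redis: it assumes every eviction run examines all idle connections (numTestsPerEvictionRun = -1, as in JedisPoolConfig), and the class and method names are hypothetical.

```java
/** Hypothetical sanity check of pool eviction settings against the Redis server timeout. */
public class TimeoutCheckSketch {

    /**
     * Returns true when an idle pooled connection is expected to be removed by the client
     * before the server closes it. All values are in milliseconds.
     * A serverTimeoutMillis of 0 means the server never closes idle connections.
     */
    public static boolean idleConnectionsStaySafe(long minEvictableIdleTimeMillis,
                                                  long timeBetweenEvictionRunsMillis,
                                                  long serverTimeoutMillis) {
        if (serverTimeoutMillis == 0) {
            return true;  // the server never disconnects idle clients
        }
        if (timeBetweenEvictionRunsMillis <= 0) {
            return false; // the Evictor is not running, idle connections can go stale
        }
        // Worst case: the connection becomes evictable just after a run has finished
        long worstCaseIdleMillis = minEvictableIdleTimeMillis + timeBetweenEvictionRunsMillis;
        return worstCaseIdleMillis < serverTimeoutMillis;
    }
}
```

With the JedisPoolConfig defaults (60000 and 30000) the worst case is about 90 seconds, well below a 600-second server timeout. Against the 10-second timeout used in the demonstration, the same settings would not protect the client.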

3. Network instability factors

Return to the JedisConnectionException stack information for the Sentry alarm mentioned at the beginning of this article.

Review the exception stack as follows:

redis.clients.jedis.exceptions.JedisConnectionException: Unexpected end of stream.
    at redis.clients.util.RedisInputStream.ensureFill(RedisInputStream.java:199)
    at redis.clients.util.RedisInputStream.readByte(RedisInputStream.java:40)
    at redis.clients.jedis.Protocol.process(Protocol.java:151)
    at redis.clients.jedis.Protocol.read(Protocol.java:215)
    at redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:340)
    at redis.clients.jedis.Connection.getStatusCodeReply(Connection.java:239)
    at redis.clients.jedis.BinaryJedis.auth(BinaryJedis.java:2139)
    at redis.clients.jedis.JedisFactory.makeObject(JedisFactory.java:108)
...

The Unexpected end of stream exception is thrown at auth, when the password is verified after connect while a new resource connection is being created.

After the detailed analysis above, with a full Redis client buffer and an unreasonable timeout setting ruled out, what remains is probably related to network factors. This problem never occurred when the application was deployed on virtual or physical machines outside containers; it now occurs occasionally in K8S containers. The O&M team needs to be familiar with the container network setup and use packet capture tools to troubleshoot the network and pin down the cause.

According to the final analysis, occasional problems such as network jitter can be handled by adding a retry mechanism on the client.
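A client-side retry can be as small as the helper below. It is a generic sketch, not the retry mechanism used in the project: the class RetrySketch, the attempt count, and the linear backoff are assumptions. It retries on any RuntimeException; in real code you would probably narrow that to JedisConnectionException and borrow a fresh connection from the pool inside the action on each attempt.

```java
import java.util.function.Supplier;

/** Hypothetical retry helper for occasional, transient failures such as network jitter. */
public class RetrySketch {

    public static <T> T withRetry(Supplier<T> action, int maxAttempts, long backoffMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt < maxAttempts) {
                    try {
                        Thread.sleep(backoffMillis * attempt); // simple linear backoff
                    } catch (InterruptedException ie) {
                        Thread.currentThread().interrupt();
                        throw e; // stop retrying when interrupted
                    }
                }
            }
        }
        throw last; // every attempt failed: surface the last error
    }
}
```

Keeping the number of attempts small matters here: if the failure is not transient, retries only add latency before the error reaches the caller and the monitoring system.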

In addition, we also did several tests on the Redis cluster in the K8S container and did not find any performance problems.

[7] Analysis on rationality of resource pool production allocation

If you are not sure whether the Jedis thread pool parameters are properly set, you can configure some core parameters and check them out online with the JMX tool.

Look again at the JMX tool to view the properties:

CreatedCount: total number of objects created by the resource pool

DestroyedCount: total number of resource pool objects destroyed

DestroyedByEvictorCount: number of resource pool objects destroyed by the Evictor idle monitoring task

BorrowedCount: number of times objects were borrowed from the resource pool

ReturnedCount: number of times objects were returned to the resource pool

CreatedCount is 6393 and DestroyedByEvictorCount is 6381, which means that almost every object is destroyed by the Evictor idle monitoring task shortly after it is created.
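The same check can be scripted instead of read off a JMX console. The following is a rough sketch under stated assumptions: it must run inside the application's JVM, the pool must have JMX enabled, and it assumes the default Commons Pool2 object name base `org.apache.commons.pool2:type=GenericObjectPool`. The class and method names are hypothetical. The attribute names use the Commons Pool2 spelling (DestroyedByEvictorCount).

```java
import java.lang.management.ManagementFactory;
import java.util.Set;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class PoolChurnSketch {

    /** Fraction of created objects that the Evictor destroyed (0.0 when nothing was created). */
    public static double evictorShare(long createdCount, long destroyedByEvictorCount) {
        return createdCount == 0 ? 0.0 : (double) destroyedByEvictorCount / createdCount;
    }

    /** Prints the churn ratio of every matching pool MBean and returns how many were found. */
    public static int report() throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        // Assumption: pools are registered under the default Commons Pool2 name base.
        ObjectName pattern =
                new ObjectName("org.apache.commons.pool2:type=GenericObjectPool,name=*");
        Set<ObjectName> pools = server.queryNames(pattern, null);
        for (ObjectName pool : pools) {
            long created = ((Number) server.getAttribute(pool, "CreatedCount")).longValue();
            long evicted =
                    ((Number) server.getAttribute(pool, "DestroyedByEvictorCount")).longValue();
            System.out.printf("%s: %.1f%% of created objects destroyed by the Evictor%n",
                    pool, 100 * evictorShare(created, evicted));
        }
        return pools.size();
    }

    public static void main(String[] args) throws Exception {
        // The figures quoted in the article: 6381 of 6393 created objects were evicted.
        System.out.println(evictorShare(6393, 6381));
        report();
    }
}
```

A ratio close to 1, as with the figures above, suggests that connections rarely live long enough to be reused.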

According to the Evictor configuration described earlier, the task runs every 30 seconds, and an object that has not been used for more than 60 seconds is destroyed.

The Redis server timeout is 10 minutes. If we do not want objects to be destroyed so quickly, we should keep them in the resource pool longer by tuning the parameters of the idle monitoring task, which reduces the overhead of creating new connections.

Parameter optimization example:

// defaults to make your life with connection pool easier :)
jedisPoolConfig.setTestWhileIdle(true);
// Increase the minimum idle time so that connections stay in the pool longer.
jedisPoolConfig.setMinEvictableIdleTimeMillis(180000);
// Interval between runs of the idle detection task.
jedisPoolConfig.setTimeBetweenEvictionRunsMillis(30000);
// Number of objects sampled in each detection run (it could be set to 5, for example).
jedisPoolConfig.setNumTestsPerEvictionRun(1);

The parameter analysis also shows that maxIdle = 60 and maxTotal = 100 are larger than needed, so reduce them to more appropriate values:

jedisPoolConfig.setMaxTotal(30);
jedisPoolConfig.setMaxIdle(10);
jedisPoolConfig.setMinIdle(5);
jedisPoolConfig.setMaxWaitMillis(1000);

In addition, based on the eviction policy used by the idle resource detection task, softMinEvictableIdleTimeMillis can be combined with minIdle. For example, with softMinEvictableIdleTimeMillis set to 180 seconds and minIdle set to 5, a resource is removed from the pool only when it has been idle for more than 180 seconds and the idle list (idleObjects) holds more than minIdle resources.

This keeps at least minIdle connections in the resource pool and avoids frequently creating new connections.
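The eviction decision described above can be modeled as a small pure function. This is a simplified sketch of the rule as the article describes it, not a copy of the Commons Pool2 source; the class name, method name, and parameter order are assumptions. A hard limit (minEvictableIdleTimeMillis) evicts regardless of minIdle, so the sketch passes Long.MAX_VALUE to model a disabled hard limit.

```java
public class EvictionRuleSketch {

    /**
     * Returns true when an idle object should be evicted.
     * Soft rule: idle longer than softMinEvictableIdleMillis AND more than minIdle objects idle.
     * Hard rule: idle longer than minEvictableIdleMillis, regardless of minIdle.
     */
    public static boolean shouldEvict(long idleMillis,
                                      int idleCount,
                                      long minEvictableIdleMillis,
                                      long softMinEvictableIdleMillis,
                                      int minIdle) {
        boolean softRule = softMinEvictableIdleMillis < idleMillis && minIdle < idleCount;
        boolean hardRule = minEvictableIdleMillis < idleMillis;
        return softRule || hardRule;
    }

    public static void main(String[] args) {
        long hardDisabled = Long.MAX_VALUE; // models "no hard limit" (assumption of this sketch)
        long soft = 180_000L;               // 180 seconds, as in the example above
        int minIdle = 5;

        // Idle for 200 s with 8 idle objects: above minIdle, so it is evicted.
        System.out.println(shouldEvict(200_000L, 8, hardDisabled, soft, minIdle)); // true
        // Idle for 200 s with only 5 idle objects: kept to preserve minIdle.
        System.out.println(shouldEvict(200_000L, 5, hardDisabled, soft, minIdle)); // false
        // Idle for 100 s: below the soft threshold, kept.
        System.out.println(shouldEvict(100_000L, 8, hardDisabled, soft, minIdle)); // false
    }
}
```

In this model, leaving the hard limit at 180 seconds would evict every connection idle for that long, whatever the value of minIdle, which is why the soft limit is the one to combine with minIdle.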

When you run into an exception, you can also search the Jedis GitHub issues to see whether it has already been answered.

The main line of analysis in this article also comes from ISSUE#932 and ISSUE#1092. However, everyone's problem is different, and so is the solution.

For example, ISSUE#1092 received the following answer:

Set timeout to 0 on the Redis server so that Redis never actively closes idle connections, and then set maxIdle to 0 on the client.

With maxIdle set to 0, the resource pool is not really used: every request creates a new connection, which is destroyed immediately after it is returned.

However, the reasoning behind the change is sound: the connection had been closed by Redis while the client was still trying to use it, and that is what caused the exception.

[8] Summary of this paper

This article started from an exception on the Redis Java client and analyzed the whole process in detail, beginning with the monitored exception stack.

From the perspective of the Jedis client, it analyzed the basic principles of the Apache Commons Pool2 object pool, including object creation, object destruction, and the idle resource monitoring task.

Based on the configuration parameters the pool uses, it showed how to judge, with tools or by reading the source code, whether the JedisPool parameters are set reasonably.

From the perspective of the Redis server, it analyzed whether the client buffer parameters and the timeout parameter on the server are set reasonably, and which circumstances can lead to the Unexpected end of stream exception.

By working through the exceptions raised by the Redis client and understanding the basic principles of the tools (frameworks) involved on both the client and the server side, we can handle all kinds of exceptions better and find the root cause.

Performance problems in many applications can be tuned through parameters, but only tune them after thoroughly analyzing the parameter configuration and the principles behind it.

This article covers only the Unexpected end of stream exception. Other exceptions thrown by the Jedis client can be analyzed with the same approach.

Here is a summary of some common Jedis exceptions:

1) With blockWhenExhausted = true, if no connection becomes available within maxWaitMillis, the pool throws:

Caused by: java.util.NoSuchElementException: Timeout waiting for idle object

2) With blockWhenExhausted = false, the pool throws immediately when no connection is available:

Caused by: java.util.NoSuchElementException: Pool exhausted

For 1) and 2), check whether slow queries are blocking Redis and whether maxWaitMillis is set too short.

3) If Redis cannot be reached and the connection is refused, the client throws:

Caused by: java.net.ConnectException: Connection refused

Check that the Redis domain name is configured correctly and whether the network was abnormal during that period.

4) If a client read or write times out, the client throws:

JedisConnectionException: java.net.SocketTimeoutException: Read timed out

5) If establishing the connection times out, the client throws:

JedisConnectionException: java.net.SocketTimeoutException: connect timed out

For 4) and 5), check whether the read/write or connection timeout is set too short, whether there are slow queries or Redis is blocked, and whether the network is unstable.

6) If a pipeline is used incorrectly, the client throws:

JedisDataException: Please close pipeline or multi block before calling this method.

Follow pipeline best practices; for example, to process the results of a batch, it is recommended to use pipeline.syncAndReturnAll().
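The checklist above can be turned into a small lookup that walks an exception's cause chain and returns the matching hint. This is an illustrative sketch only: the class `JedisErrorHints`, the method `hintFor`, and the hint strings are hypothetical. It matches on the JDK exception types and message fragments quoted in the list, so it needs no Jedis dependency.

```java
import java.net.ConnectException;
import java.net.SocketTimeoutException;
import java.util.NoSuchElementException;

public class JedisErrorHints {

    /** Walks the cause chain and returns a troubleshooting hint for the first known pattern. */
    public static String hintFor(Throwable error) {
        for (Throwable t = error; t != null; t = t.getCause()) {
            String msg = t.getMessage() == null ? "" : t.getMessage().toLowerCase();
            if (t instanceof NoSuchElementException && msg.contains("timeout waiting for idle object")) {
                return "borrow timed out: check slow queries and maxWaitMillis";
            }
            if (t instanceof NoSuchElementException && msg.contains("pool exhausted")) {
                return "pool exhausted: check slow queries and pool size";
            }
            if (t instanceof ConnectException && msg.contains("connection refused")) {
                return "connection refused: check the Redis address and the network";
            }
            if (t instanceof SocketTimeoutException && msg.contains("connect timed out")) {
                return "connect timeout: check the connection timeout and the network";
            }
            if (t instanceof SocketTimeoutException && msg.contains("read timed out")) {
                return "read timeout: check the read timeout, slow queries, and the network";
            }
            if (msg.contains("close pipeline or multi block")) {
                return "pipeline misuse: sync or close the pipeline first";
            }
        }
        return "unclassified: inspect the full stack trace";
    }

    public static void main(String[] args) {
        // Simulated wrapper exception, similar in shape to what a pool would report.
        Throwable sample = new RuntimeException("Could not get a resource from the pool",
                new NoSuchElementException("Pool exhausted"));
        System.out.println(hintFor(sample)); // pool exhausted: check slow queries and pool size
    }
}
```

A lookup like this fits naturally into an alert handler, so that each alert already carries a first troubleshooting direction.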

Other exceptions need to be analyzed case by case.

That is the end of the article. Writing it took some effort; if anything is missing or wrong, corrections are welcome. I hope it helps. Thank you.

References:

https://yq.aliyun.com/articles/236384?spm=a2c4e.11155435.0.0.e21e2612uQAVoW#cc1

https://github.com/xetorthio/jedis/issues/932

https://github.com/xetorthio/jedis/issues/1029

https://www.cnblogs.com/benthal/p/10761868.html
