Goals

  • Slow consumption of ActiveMQ in production environment
  • Service thread blocking troubleshooting after the first code optimization
  • Final problem resolution

Slow consumption of ActiveMQ in production environment

One of our company's production systems uses ActiveMQ for communication. Because of the particular relationship between the upstream and downstream systems, messages are exchanged in P2P (queue) mode. The sending service connects to hundreds of ActiveMQ message queues, while each instance of the downstream service connects to a single queue, and the message volume is not large. The message producer is therefore a single-threaded program that sends synchronously: a send completes only after the message has reached the broker and been successfully persisted to disk.

At some point a backlog of unprocessed data suddenly appeared, and it kept growing over time. In fact, the system's message volume had been increasing all along. The production network is also somewhat unusual and has high latency. A colleague therefore proposed a first fix for the growing data volume: change single-threaded sending to multi-threaded sending. Note that this was the only change, and it turned out to be a trap that set up the problems described next.
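Why adding sender threads alone can be a trap is easier to see with a rough model. The sketch below is a simplification under stated assumptions, not a measurement: network round trips overlap across threads, while the broker is assumed to persist one message at a time. The class name SendThroughputModel and every figure passed to it are hypothetical.

```java
// Simplified model, not a measurement: all figures passed in are hypothetical.
final class SendThroughputModel {

    /**
     * Estimates how long it takes to send a batch of messages synchronously.
     *
     * @param messages        number of messages to send
     * @param senderThreads   number of producer threads sending in parallel
     * @param networkRttMs    network round trip per send (overlaps across threads)
     * @param brokerPersistMs time the broker needs to persist one message
     *                        (assumed to be strictly serial on the broker)
     */
    static long estimateMillis(long messages, int senderThreads,
                               long networkRttMs, long brokerPersistMs) {
        // Each thread sends its share of the messages one after another.
        long perThread = (messages + senderThreads - 1) / senderThreads;
        long senderBound = perThread * (networkRttMs + brokerPersistMs);
        // The broker persists one message at a time, whatever the thread count.
        long brokerBound = messages * brokerPersistMs;
        return Math.max(senderBound, brokerBound);
    }

    public static void main(String[] args) {
        // Network-bound case: 50 ms round trip, 1 ms persistence.
        System.out.println(estimateMillis(1000, 1, 50, 1));       // 51000
        System.out.println(estimateMillis(1000, 10, 50, 1));      // 5100
        // Broker-bound case: persistence takes 10 s per message.
        System.out.println(estimateMillis(1000, 1, 50, 10_000));  // 10050000
        System.out.println(estimateMillis(1000, 10, 50, 10_000)); // 10000000
    }
}
```

In this model, ten threads cut the network-bound case by a factor of ten but improve the broker-bound case by less than one percent. Extra threads hide network latency, not a slow broker.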

Service thread blocking troubleshooting after the first code optimization

After the problem appeared, my colleague quickly proposed the first fix: multi-threaded sending. This was a trap. I did not know ActiveMQ well enough, nor the JmsTemplate provided by Spring, so I mistakenly believed that sending in parallel would solve the problem, and the code was changed quickly. The performance test showed no problems, so the program went live. Then the trouble started: after running for a few minutes, the program appeared to hang and data processing stopped. After a while we found that it was not completely stuck. Processing was just extremely slow, with some data still getting through every so often. Multi-threaded sending puts more pressure on ActiveMQ than single-threaded sending, and a broker that cannot keep up processes messages even more slowly. I recommended rolling back the release immediately and, at the same time, printing the JVM thread stacks to see exactly where the program was stuck. My analysis of the thread stack log follows.

  • Thread stack log:
"mySend-83" #509 prio=5 os_prio=0 tid=0x00007fb480048000 nid=0x35b2 waiting on condition [0x00007fb409566000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x00000000f9f9b008> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
    at java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:403)
    at org.apache.activemq.transport.FutureResponse.getResult(FutureResponse.java:48)
    at org.apache.activemq.transport.ResponseCorrelator.request(ResponseCorrelator.java:87)
    at org.apache.activemq.ActiveMQConnection.syncSendPacket(ActiveMQConnection.java:1382)
    at org.apache.activemq.ActiveMQConnection.syncSendPacket(ActiveMQConnection.java:1319)
    at org.apache.activemq.ActiveMQSession.send(ActiveMQSession.java:1967)
    - locked <0x00000000879b5d48> (a java.lang.Object)
    at org.apache.activemq.ActiveMQMessageProducer.send$original$UKNtu2e7(ActiveMQMessageProducer.java:288)
    at org.apache.activemq.ActiveMQMessageProducer.send$original$UKNtu2e7$accessor$V5Iy6ePf(ActiveMQMessageProducer.java)
    at org.apache.activemq.ActiveMQMessageProducer$auxiliary$ZkuXgd8X.call(Unknown Source)
    at org.apache.skywalking.apm.agent.core.plugin.interceptor.enhance.InstMethodsInter.intercept(InstMethodsInter.java:93)
    at org.apache.activemq.ActiveMQMessageProducer.send(ActiveMQMessageProducer.java)
    at org.apache.activemq.ActiveMQMessageProducer.send$original$UKNtu2e7(ActiveMQMessageProducer.java:223)
    at org.apache.activemq.ActiveMQMessageProducer.send$original$UKNtu2e7$accessor$V5Iy6ePf(ActiveMQMessageProducer.java)
    at org.apache.activemq.ActiveMQMessageProducer$auxiliary$Rh0cug33.call(Unknown Source)
    at org.apache.skywalking.apm.agent.core.plugin.interceptor.enhance.InstMethodsInter.intercept(InstMethodsInter.java:93)
    at org.apache.activemq.ActiveMQMessageProducer.send(ActiveMQMessageProducer.java)
    at org.apache.activemq.ActiveMQMessageProducerSupport.send(ActiveMQMessageProducerSupport.java:269)
    at org.springframework.jms.connection.CachedMessageProducer.send(CachedMessageProducer.java:181)
    at org.springframework.jms.core.JmsTemplate.doSend(JmsTemplate.java:626)
    at org.springframework.jms.core.JmsTemplate.doSend(JmsTemplate.java:597)
    at org.springframework.jms.core.JmsTemplate$4.doInJms(JmsTemplate.java:574)
    at org.springframework.jms.core.JmsTemplate.execute(JmsTemplate.java:484)
    at org.springframework.jms.core.JmsTemplate.send(JmsTemplate.java:570)
    at com.ruubypay.miss.obpsc.db.service.impl.SendMsgServiceImpl.processDataOne(SendMsgServiceImpl.java:47)
    at com.ruubypay.miss.obpsc.db.service.impl.ObpsCBlacklistChangeServiceImpl$2.run(ObpsCBlacklistChangeServiceImpl.java:172)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

   Locked ownable synchronizers:
    - <0x000000008752f948> (a java.util.concurrent.ThreadPoolExecutor$Worker)
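A dump like the one above is usually taken with the JDK's jstack tool. As a minimal sketch of the same idea from inside the JVM, the following uses only the standard library to list threads that are WAITING in a given method. The class name BlockedThreadFinder and the helper startBlockedThread are hypothetical, and the demo blocks on a plain ArrayBlockingQueue rather than on ActiveMQ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.TimeUnit;

final class BlockedThreadFinder {

    /** Returns the names of live threads that are WAITING with the given frame on their stack. */
    static List<String> findWaitingIn(String className, String methodName) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            if (thread.getState() != Thread.State.WAITING) {
                continue;
            }
            for (StackTraceElement frame : entry.getValue()) {
                if (frame.getClassName().equals(className)
                        && frame.getMethodName().equals(methodName)) {
                    names.add(thread.getName());
                    break;
                }
            }
        }
        return names;
    }

    /** Demo helper (hypothetical): starts a daemon thread that blocks in take() on an empty queue. */
    static Thread startBlockedThread(String name) {
        ArrayBlockingQueue<Object> emptySlot = new ArrayBlockingQueue<>(1);
        Thread thread = new Thread(() -> {
            try {
                emptySlot.take(); // blocks: nothing is ever offered to this queue
            } catch (InterruptedException ignored) {
                // interrupted by the caller to end the demo
            }
        }, name);
        thread.setDaemon(true);
        thread.start();
        // Wait (at most 5 s) until the thread has actually parked.
        long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
        while (thread.getState() != Thread.State.WAITING && System.nanoTime() < deadline) {
            Thread.yield();
        }
        return thread;
    }

    public static void main(String[] args) {
        Thread blocked = startBlockedThread("demo-sender");
        System.out.println(findWaitingIn("java.util.concurrent.ArrayBlockingQueue", "take"));
        blocked.interrupt();
    }
}
```

Against the dump above, the frame of interest would be org.apache.activemq.transport.FutureResponse with method getResult.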
  • Thread log Analysis

The log above clearly shows that the thread is stuck at line 47 of com.ruubypay.miss.obpsc.db.service.impl.SendMsgServiceImpl.processDataOne. This line sends a message; the code is as follows:

jmsQueueTemplate.send(queueName, new MessageCreator() {
    @Override
    public Message createMessage(Session session) throws JMSException {
        TextMessage textMessage = session.createTextMessage(msg);
        textMessage.setStringProperty("changeTimestamp", timestamp);
        return textMessage;
    }
});

The same log pinpoints where the thread is stuck. The syncSendPacket frames show that the message is sent in synchronous mode, and the frame org.apache.activemq.transport.FutureResponse.getResult(FutureResponse.java:48) shows that the thread is blocked at line 48 of FutureResponse, in responseSlot.take(). responseSlot is an ArrayBlockingQueue that receives the broker's response once the message has been processed. This explains the behavior: a synchronous send must wait for the broker to confirm that the message has been persisted, and a blocking queue holds that confirmation. The sending thread calls take() to fetch it and blocks for as long as no response is available. If the response does not arrive, the thread stays blocked here.

public Response getResult() throws IOException {
    boolean hasInterruptPending = Thread.interrupted();
    try {
        return responseSlot.take();
    } catch (InterruptedException e) {
        hasInterruptPending = false;
        throw dealWithInterrupt(e);
    } finally {
        if (hasInterruptPending) {
            Thread.currentThread().interrupt();
        }
    }
}
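The difference between an unbounded take() and a bounded wait can be sketched with the standard library alone. The class TimedResponseSlot below is hypothetical and is not ActiveMQ code. It only illustrates that poll with a timeout returns control to the caller when no reply arrives, whereas take() parks the thread until one does. ActiveMQ has a sendTimeout setting on the connection for bounding synchronous sends. Whether it suits a given deployment is a separate question that this sketch does not answer.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.TimeUnit;

final class TimedResponseSlot {
    // One-element slot, mirroring the blocking queue used in getResult().
    private final ArrayBlockingQueue<String> slot = new ArrayBlockingQueue<>(1);

    /** Called by the (simulated) transport when the broker's reply arrives. */
    void complete(String response) {
        slot.offer(response);
    }

    /**
     * Waits up to timeoutMs for the reply. Returns null on timeout
     * instead of parking the thread indefinitely as take() would.
     */
    String awaitReply(long timeoutMs) {
        try {
            return slot.poll(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return null;
        }
    }

    public static void main(String[] args) {
        TimedResponseSlot noReply = new TimedResponseSlot();
        System.out.println(noReply.awaitReply(100)); // null: no reply within 100 ms

        TimedResponseSlot replied = new TimedResponseSlot();
        replied.complete("ack");
        System.out.println(replied.awaitReply(100)); // ack
    }
}
```

A caller that receives null can log the slow send, retry, or fail fast, rather than leaving the sending thread parked.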
  • Troubleshoot problems

The analysis above establishes that the thread is blocked waiting for the broker's response. But why was the wait so long, when the network latency was not that bad? I compared the ActiveMQ setup with the one in the performance test environment and found that the two environments used different persistence: MySQL in production and LevelDB in the performance test. We switched ActiveMQ persistence to MySQL in the performance test environment, but the problem still did not appear, which was strange. We then turned to how MySQL was behaving in production and finally found the cause in the MySQL execution log: every SQL statement was very slow, taking ten or even tens of seconds. That located the problem in MySQL.

Final problem resolution

In the end, we solved the problem by switching the MySQL database used by ActiveMQ.