preface

Interface performance optimization is certainly familiar to those of you who are engaged in back-end development, as it is a common problem independent of the development language.

The problem is as complicated as it is simple.

Sometimes, just adding an index can solve the problem.

Sometimes, you need to do code refactoring.

Sometimes, you need to increase the cache.

Sometimes, some middleware, such as MQ, needs to be introduced.

Sometimes, you need to split databases and tables.

Sometimes, services need to be split.

And so on…

The causes of interface performance problems can vary from project to project and interface to interface.

In this article, I have summarized some proven methods for optimizing interface performance, for the reference of anyone who needs them.

1. Index optimization

When it comes to interface performance optimization, the first thing that comes to mind is probably optimizing indexes.

Yes, the cost of optimizing indexes is minimal.

You can check online logs or monitoring reports to find that a certain SQL statement used by a certain interface takes a long time.

You may have the following questions:

  1. Has the SQL statement been indexed?
  2. Did the added index take effect?
  3. Did MySQL pick the wrong index?


1.1 Not indexed

A key field in the SQL statement's where condition, or the sort field after order by, is not indexed. This is a common problem in projects.

At the beginning of the project, because of the small amount of data in the table, there was little difference in SQL query performance with or without indexes.

Later, as the business grew, the amount of data in the table grew, and indexes had to be added.

You can run the following command:

show index from `order`;

to view a table's indexes individually.

You can also run the following command:

show create table `order`;

to view the table's full CREATE statement, which also shows its indexes.

You can add indexes with the ALTER TABLE command:

ALTER TABLE `order` ADD INDEX idx_name (name);

You can also add an INDEX by using the CREATE INDEX command:

CREATE INDEX idx_name ON `order` (name);

There is one caveat: you cannot modify an existing index with a single command.

If you want to change an index in mysql, you can only delete the index and then add a new index.

To drop an index, you can use the ALTER TABLE command:

ALTER TABLE `order` DROP INDEX idx_name;

You can also use the DROP INDEX command:

DROP INDEX idx_name ON `order`;

1.2 Index Not Taking Effect

Using the commands above, we can confirm that an index exists; but is it actually effective? That question may come to mind at this point.

So, how do you check if an index is working?

A: You can view the mysql execution plan using the explain command, which shows index usage.

Such as:

explain select * from `order` where code='002';

The columns in the result (such as type, possible_keys, key, key_len, and rows) can be used to determine how the index was used. If you want more detail on using explain, check out my other article, "Explain | This Peerless Sword for Index Optimization, Can You Really Use It?"

To be honest, if a SQL statement doesn't use an index, and it's not because the index is missing, the most likely cause is that the index has become ineffective.

There are some common reasons for index failure. If none of them applies, you need to look for other causes.
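For reference, here are a few query patterns that commonly make an index ineffective (a sketch, assuming an index exists on the `code` column of the `order` table):

```sql
-- Index works: plain equality on the indexed column
select * from `order` where code = '002';

-- Index often fails: a function wrapped around the indexed column
select * from `order` where upper(code) = '002';

-- Index often fails: a leading-wildcard LIKE
select * from `order` where code like '%002';

-- Index often fails: implicit type conversion (code is a string column)
select * from `order` where code = 2;
```

Running explain on each of these is a quick way to see the difference in the `key` column.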

1.3 Selecting the Wrong index

In addition, have you ever encountered the situation where the same SQL statement, with different input parameters, sometimes uses index A and sometimes uses index B?

Yes, sometimes mysql picks the wrong index.

If necessary, you can use the force index syntax to force a query to use a given index.

As for why mysql sometimes picks the wrong index, that is a topic worth an article of its own.

2. SQL optimization

If optimizing the indexes doesn't have an obvious effect, the next step is to try optimizing the SQL statements themselves, because that is also much cheaper than modifying Java code.

There are 15 common tips for SQL optimization. Since these techniques have been covered in detail in my previous articles, I won't go into them here.

For more details, you can read my article "15 Tips for SQL Optimization"; I believe you will gain a lot from it.

3. Remote invocation

Many times, we need to call the interfaces of other services from one interface.

For example, here is a business scenario:

The user information query interface returns the following information: user name, gender, level, profile picture, score, and growth value.

The user name, gender, level, and profile picture live in the user service, the score in the points service, and the growth value in the growth value service. To aggregate this data and return it in a single response, an external interface service needs to be provided.

Therefore, the user information query interface needs to call the user query interface, the points query interface, and the growth value query interface, then aggregate the data and return it.

The call process is as follows: the total time of invoking the remote interfaces serially is 530ms = 200ms + 150ms + 180ms.

Obviously, the performance of this serial invocation of the remote interface is very poor, and the total invocation time of the remote interface is the sum of all the remote interface time.

So how do you optimize remote interface performance?

3.1 Parallel Invocation

Since serial calls to multiple remote interfaces are bad, why not do them in parallel?

As shown below: the total time of the remote interface invocations becomes 200ms, that is, the time of the slowest remote call.

Prior to Java 8, you could get a thread's return value by implementing the Callable interface.

Since Java 8, the same thing can be done with the CompletableFuture class. Here we use CompletableFuture as an example:

public UserInfo getUserInfo(Long id) throws InterruptedException, ExecutionException {
    final UserInfo userInfo = new UserInfo();
    CompletableFuture<Boolean> userFuture = CompletableFuture.supplyAsync(() -> {
        getRemoteUserAndFill(id, userInfo);
        return Boolean.TRUE;
    }, executor);

    CompletableFuture<Boolean> bonusFuture = CompletableFuture.supplyAsync(() -> {
        getRemoteBonusAndFill(id, userInfo);
        return Boolean.TRUE;
    }, executor);

    CompletableFuture<Boolean> growthFuture = CompletableFuture.supplyAsync(() -> {
        getRemoteGrowthAndFill(id, userInfo);
        return Boolean.TRUE;
    }, executor);

    // Wait for all three remote calls to complete
    CompletableFuture.allOf(userFuture, bonusFuture, growthFuture).join();

    return userInfo;
}

As a cautionary note, don't forget to use a thread pool either way. In the example, executor represents a custom thread pool, which prevents the creation of too many threads in high-concurrency scenarios.

3.2 Data heterogeneity

The user information query interface mentioned above needs to call the user query interface, the points query interface, and the growth value query interface, then aggregate the data and return it.

So, can we make the data redundant instead: store the user information, points, and growth value together in one place, such as Redis, with the data structure matching exactly what the user information query interface needs? Then we could query the data directly from Redis by user ID. Wouldn't that work?

In high-concurrency scenarios, this data heterogeneity scheme, which saves redundant data, removes the remote interface calls and thereby improves interface performance.

However, it is important to note that data consistency issues may occur if a data heterogeneity scheme is used.

When user information, points, and growth values are updated, in most cases, they are first updated to the database and then synchronized to Redis. However, this cross-library operation may lead to inconsistent data on both sides.

4. Repeat the call

Repeated calls are common in our everyday working code, but if not well controlled, they can seriously affect interface performance.

Don't believe me? Let's take a look.

4.1 Checking the Database Repeatedly

Sometimes, we need to find out from a specified set of users which already exist in the database.

The implementation code could be written as follows:

public List<User> queryUser(List<User> searchList) {
    if (CollectionUtils.isEmpty(searchList)) {
        return Collections.emptyList();
    }

    List<User> result = Lists.newArrayList();
    searchList.forEach(user -> result.add(userMapper.getUserById(user.getId())));
    return result;
}

If you have 50 users, you need to loop 50 times to query the database. As we all know, every query to the database is a remote call.

If you query the database 50 times, you have 50 remote calls, which is a very time consuming operation.

So, how do we optimize?

The specific code is as follows:

public List<User> queryUser(List<User> searchList) {
    if (CollectionUtils.isEmpty(searchList)) {
        return Collections.emptyList();
    }
    List<Long> ids = searchList.stream().map(User::getId).collect(Collectors.toList());
    return userMapper.getUserByIds(ids);
}

Provide an interface to batch query users according to the user ID set, only remote call once, can query all data.

One caveat here is that the size of the ID collection is limited and it is best not to request too much data at once. You are advised to limit the number of records for each request to less than 500.

4.2 Infinite Loops

Some friends may feel a little surprised to see this title.

Shouldn’t endless loops be avoided in your code? Why is there still an endless loop?

Sometimes we write the loop ourselves, as in this code:

while (true) {
    if (condition) {
        break;
    }
    System.out.println("do something");
}

This uses a while(true) loop, which is more commonly used in CAS spinlocks.

When condition equals true, the loop is automatically exited.
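As a minimal illustration of the pattern (a sketch of my own, not from the article), here is how a CAS spin loop typically looks with AtomicInteger: the while(true) exits only when compareAndSet succeeds, and under contention it simply retries with the freshly read value.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SpinCounter {
    private final AtomicInteger value = new AtomicInteger(0);

    // Spin until our compareAndSet wins; on failure another thread
    // changed the value first, so we re-read and retry.
    public int increment() {
        while (true) {
            int current = value.get();
            int next = current + 1;
            if (value.compareAndSet(current, next)) {
                return next; // the only exit from the while(true)
            }
        }
    }

    public int get() {
        return value.get();
    }
}
```

The loop is guaranteed to make progress as long as some thread's CAS succeeds, which is exactly why the exit condition must be airtight.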

If the condition is complex and the judgment is written incorrectly, or a logical branch is missing, an infinite loop can occur in certain scenarios.

An infinite loop like this is most likely caused by a developer bug, but this kind is relatively easy to detect.

There is also a more deeply hidden infinite loop, caused by careless coding. With normal data the problem may never show up, but as soon as abnormal data appears, the infinite loop strikes immediately.

4.3 Infinite recursion

If you want to print all the parents of a category, you can do this recursively:

public void printCategory(Category category) {
    if (category == null || category.getParentId() == null) {
        return;
    }
    System.out.println("Parent category name: " + category.getName());
    Category parent = categoryMapper.getCategoryById(category.getParentId());
    printCategory(parent);
}

Normally, this code is fine.

However, if someone mistakenly points a parentId of a classification to itself, infinite recursion will occur. As a result, the interface cannot return data and eventually a stack overflow occurs.

When writing a recursive method, you are advised to set a recursion depth limit. For example, if the maximum classification level is 4, the depth limit can be set to 4. The recursive method then checks the depth and returns immediately once it exceeds 4, thus avoiding infinite recursion.
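A sketch of that guard, with a hypothetical in-memory map standing in for categoryMapper and a level counter in place of the print statements (the names and MAX_DEPTH value are assumptions for illustration). The depth check keeps the recursion safe even when bad data points a category at itself:

```java
import java.util.HashMap;
import java.util.Map;

public class CategoryWalker {
    private static final int MAX_DEPTH = 4; // assumed maximum category level

    // Hypothetical stand-in for categoryMapper: child id -> parent id
    private final Map<Long, Long> parentOf = new HashMap<>();

    public CategoryWalker(Map<Long, Long> parentOf) {
        this.parentOf.putAll(parentOf);
    }

    // Walks up the category tree and returns how many levels were visited;
    // stops at MAX_DEPTH even if the data contains a cycle.
    public int walkParents(Long id, int depth) {
        if (id == null || depth >= MAX_DEPTH) {
            return depth;
        }
        Long parentId = parentOf.get(id);
        if (parentId == null) {
            return depth;
        }
        return walkParents(parentId, depth + 1);
    }
}
```

With a corrupted record whose parentId equals its own id, the walk terminates at the depth limit instead of overflowing the stack.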

5. Asynchronous processing

Sometimes, when we optimize interface performance, we need to comb through the business logic to see if there is something in the design that doesn’t make sense.

For example, a user request interface needs to perform a business operation, send an in-site notification, and record an operation log. For ease of implementation, we usually execute all of this logic synchronously in the interface, which is bound to hurt interface performance.

The internal interface flow is as follows: the interface looks fine on the surface, but if you take a closer look at the business logic, only the business operation is core logic; the other functions are non-core logic.

The principle here: core logic can be executed synchronously and write to the database synchronously; non-core logic can be executed asynchronously and write to the database asynchronously.

In the above example, the functions of sending in-station notifications and user operation logs do not have high requirements on real-time performance. Even if the library is written late, users will only receive in-station notifications late or see user operation logs late, which has little impact on services, so it can be processed asynchronously.

There are two main types of asynchrony in general: multithreaded and MQ.

5.1 Thread Pool

After refactoring with a thread pool, the interface logic is as follows: the in-site notification and user operation logging functions are submitted to two separate thread pools.

This allows the interface to focus on business operations, leaving the rest of the logic to threads to execute asynchronously, which gives the interface an instant performance boost.
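A minimal sketch of this pattern (pool sizes and method names are made up for illustration): the interface returns right after the core logic finishes, while the non-core tasks run on dedicated pools.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class OrderService {
    // Hypothetical dedicated pools for the two non-core tasks
    private final ExecutorService noticePool = Executors.newFixedThreadPool(2);
    private final ExecutorService logPool = Executors.newFixedThreadPool(2);

    final AtomicInteger sideEffects = new AtomicInteger();

    public String handleRequest(String param) {
        String result = doBusiness(param);          // core logic, synchronous
        noticePool.submit(() -> sendNotice(param)); // non-core, async
        logPool.submit(() -> saveLog(param));       // non-core, async
        return result;                              // returns without waiting
    }

    String doBusiness(String p) { return "ok:" + p; }
    void sendNotice(String p) { sideEffects.incrementAndGet(); }
    void saveLog(String p) { sideEffects.incrementAndGet(); }

    public void shutdown() {
        noticePool.shutdown();
        logPool.shutdown();
        try {
            noticePool.awaitTermination(5, TimeUnit.SECONDS);
            logPool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

The caller only pays for doBusiness; the notification and logging work completes in the background.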

However, there is a slight problem with using thread pools: if the server restarts, or if the function that needs to be performed is abnormal and cannot be retried, data will be lost.

So what to do about this problem?

5.2 MQ

After refactoring with MQ, the interface logic is as follows: the in-site notification and user operation logging functions are not actually executed in the interface; it just sends MQ messages to the MQ server. The two functions are then actually performed when the MQ consumers consume the messages.

After this transformation, the interface performance is also improved because sending MQ messages is fast and we only need to focus on the code of business operations.

6. Avoid large transactions

Many people like to use the @Transactional annotation to provide Transactional functionality for convenience when developing projects using the Spring framework.

Yes, using the @Transactional annotation to provide Transactional functionality in a declarative Transactional way saves a lot of code and makes development more efficient.

But it also makes it easy to create large transactions, which cause other problems.

Here is a diagram of the problems caused by large transactions. As shown in the figure, large transactions may cause interface timeouts, which directly affect interface performance.

So how do we optimize large transactions?

  1. Use the @Transactional annotation sparingly
  2. Put the query (SELECT) method outside the transaction
  3. Avoid remote calls in transactions
  4. Avoid processing too much data at once in a transaction
  5. Some functions can be performed nontransactionally
  6. Some functions can be handled asynchronously
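Tips 2 and 3 boil down to keeping the transaction as short as possible. A hedged sketch of the idea, where inTransaction is a stand-in for something like Spring's programmatic TransactionTemplate.execute (all names here are made up for illustration):

```java
import java.util.function.Supplier;

public class OrderFlow {

    // Stand-in for a programmatic transaction boundary; in Spring this
    // would be transactionTemplate.execute(status -> ...).
    static <T> T inTransaction(Supplier<T> work) {
        // begin();
        T result = work.get();
        // commit(); (rollback on exception)
        return result;
    }

    public String createOrder(long userId) {
        // Queries and remote calls stay OUTSIDE the transaction...
        String user = queryUser(userId);
        // ...so the transaction only covers the actual writes.
        return inTransaction(() -> saveOrder(user));
    }

    String queryUser(long id) { return "user-" + id; }
    String saveOrder(String user) { return "order-for-" + user; }
}
```

The database connection is only held for the duration of the write, not for the whole request.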

My other post on large transactions, "How to Solve the Headache of Large Transactions?", covers this in great detail; have a look if you are interested.

7. Lock granularity

In some business scenarios, multiple threads modifying shared data concurrently can cause data anomalies.

To prevent such data inconsistency in concurrent scenarios, we usually add a lock.

However, if the lock is added improperly and its granularity is too coarse, interface performance suffers greatly.

7.1 synchronized

The synchronized keyword is provided in Java to lock our code.

It is usually written in two ways: locking on methods and locking on code blocks.

Let’s look at how to lock a method:

public synchronized void doSave(String fileUrl) {
    mkdir();
    uploadFile(fileUrl);
    sendMessage(fileUrl);
}

The purpose of the lock is to prevent the same directory from being created concurrently; the second creation attempt would fail and affect business functionality.

But locking the whole method like this makes the lock granularity a bit coarse. The file-upload and message-sending steps in doSave don't need a lock at all; only the directory-creation step does.

We all know that file upload operations are time-consuming, and if the entire method is locked, the lock is not released until the entire method is complete. Obviously, this results in poor performance and is not worth the cost.

In this case, we can change to the code block lock, the specific code is as follows:

public void doSave(String path, String fileUrl) {
    synchronized (this) {
        if (!exists(path)) {
            mkdir(path);
        }
    }
    uploadFile(fileUrl);
    sendMessage(fileUrl);
}

After this change, the lock granularity shrinks dramatically: only the concurrent directory creation is locked. Creating a directory is a very fast operation, so even with the lock, the impact on interface performance is small.

Most importantly, other file upload and message sending functions can still be executed concurrently.

Of course, this works fine as long as the service runs on a single node. However, in today's production environments, the same service is usually deployed on multiple nodes to ensure stability: if one node fails, the others remain available.

The multi-node deployment prevents service unavailability due to the failure of a node. At the same time, it can also share the flow of the whole system to avoid excessive system pressure.

At the same time, it also introduces a new problem: synchronized is only effective within a single node. How do we lock across multiple nodes?

A: This requires the use of distributed locks. At present, the mainstream distributed lock includes redis distributed lock, ZooKeeper distributed lock and database distributed lock.

Because the performance of distributed ZooKeeper locks is poor, they are rarely used in real service scenarios.

Let’s talk about redis distributed locks.

7.2 Redis Distributed Lock

In distributed systems, the redis distributed lock is simple and efficient, so it has become the first choice for distributed locking and is used in many real business scenarios.

The pseudo-code for redis distributed locks is as follows:

public boolean doSave(String path, String fileUrl) {
    try {
        String result = jedis.set(lockKey, requestId, "NX", "PX", expireTime);
        if ("OK".equals(result)) {
            if (!exists(path)) {
                mkdir(path);
                uploadFile(fileUrl);
                sendMessage(fileUrl);
            }
            return true;
        }
    } finally {
        unlock(lockKey, requestId);
    }
    return false;
}

As with synchronized, the scope of the lock is too large. In other words, the granularity of the lock is too coarse, which will lead to low efficiency of the whole method.

In fact, only when creating directories, you need to add distributed locks, the rest of the code does not need to lock at all.

So, we need to optimize the code:

public void doSave(String path, String fileUrl) {
    if (tryLock()) {
        try {
            if (!exists(path)) {
                mkdir(path);
            }
        } finally {
            // Release the lock as soon as the directory is created
            unlock(lockKey, requestId);
        }
    }
    uploadFile(fileUrl);
    sendMessage(fileUrl);
}

private boolean tryLock() {
    String result = jedis.set(lockKey, requestId, "NX", "PX", expireTime);
    return "OK".equals(result);
}

The above code narrows the scope of the lock so that it is only held while the directory is being created. This seemingly simple optimization can greatly improve interface performance; you might even be surprised by the gains.

Redis distributed locks are easy to use, but there are many details to watch out for and many hidden pitfalls that are easy to step into. For more details, check out my other post, "Talking About the 8 Big Pitfalls of Redis Distributed Locks."

7.3 Distributed Database Locks

There are three types of locks in the mysql database:

  • Table lock: fast to acquire, and deadlocks cannot occur. However, the lock granularity is the largest, the probability of lock conflict is the highest, and concurrency is the lowest.
  • Row lock: slower to acquire, and deadlocks can occur. However, the lock granularity is the smallest, the probability of lock conflict is the lowest, and concurrency is the highest.
  • Gap lock: cost and acquisition time are between those of table locks and row locks. Deadlocks can occur, the granularity is between table and row locks, and concurrency is moderate.

Higher concurrency means better interface performance.

So the optimization direction of database lock is:

Prefer row locks, then gap locks, and use table locks last.
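For example (a sketch, assuming an InnoDB table with an index on `user_id` but none on `remark`), whether the locking column is indexed determines how much gets locked:

```sql
-- Locks only the matching rows: the condition hits the user_id index
select * from `order` where user_id = 123 for update;

-- No index on remark: InnoDB must scan, locking far more rows than
-- necessary -- in effect close to locking the whole table
select * from `order` where remark = 'gift' for update;
```

Keeping lock-bearing conditions on indexed columns is what keeps you in row-lock territory.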

Let’s see. Did you use it right?

8. Paging

Sometimes I will call an interface to batch query data, for example: batch query user information by user ID, and then give points to these users.

But if you query too many users at once, say 2000, passing 2000 user IDs to the remote interface, you will find that the user query interface often times out.

The call code is as follows:

List<User> users = remoteCallUser(ids);

As we all know, the called interface gets its data from the database, and that data must be transmitted over the network. If the amount of data is too large, both fetching the data and transmitting it over a bandwidth-limited network take a long time.

So, how do you optimize this situation?

A: Paging.

Instead of fetching all the data in one request, fetch it in several, getting only part of the users' data each time, then merge the results at the end.

Actually, there are two scenarios to deal with this problem: synchronous and asynchronous invocation.

8.1 Synchronous Invocation

Suppose a job needs to fetch information for 2000 users. It requires that the data be fetched correctly and that the total fetch time not be unreasonably long.

However, the time required for each invocation of the remote interface cannot exceed 500ms; otherwise, an email warning is generated.

At this point, we can call the batch query user information interface through synchronous paging.

The specific code is as follows:

List<List<Long>> allIds = Lists.partition(ids, 200);

List<User> result = Lists.newArrayList();
for (List<Long> batchIds : allIds) {
    // Each batch is a separate remote call of at most 200 ids
    result.addAll(remoteCallUser(batchIds));
}

In the code I used Lists.partition from Google's Guava library, which handles the partitioning so you don't have to write the pagination code yourself.

8.2 Asynchronous Invocation

If you need to fetch information for 2000 users within an interface call, there is more to consider.

In addition to the time spent on remote calls, the interface's own total time must be considered, and it cannot exceed the 500ms timeout.

Using the above synchronous paging to request a remote interface is definitely not going to work.

So, you have to use asynchronous calls.

The code is as follows:

List<List<Long>> allIds = Lists.partition(ids, 200);

// Launch one asynchronous remote call per batch
List<CompletableFuture<List<User>>> futures = allIds.stream()
    .map(batchIds -> CompletableFuture.supplyAsync(
            () -> remoteCallUser(batchIds), executor))
    .collect(Collectors.toList());

// Wait for all batches to finish and merge the results
List<User> result = futures.stream()
    .map(CompletableFuture::join)
    .flatMap(List::stream)
    .collect(Collectors.toList());

With the CompletableFuture class, multiple threads call the remote interface asynchronously, and the results are collectively returned.

9. Add cache

Caching is a very efficient way to solve interface performance problems.

But you can’t cache for cache’s sake. It depends on your business scenario. After all, adding caching can lead to increased interface complexity, which can lead to data inconsistencies.

In some low-concurrency scenarios, such as user orders, caching can be dispensed with.

There are also scenarios where caching pays off, such as the product category tree displayed on a mall's home page. Suppose the category data here is fetched through an interface, and the page has not yet been made static.

If the interface to query the classification tree does not use caching and queries the data directly from the database, performance will be very poor.

So how do you use caching?

9.1 Redis Cache

In general, the caches we use most are probably redis and memcached.

But for Java applications, most use Redis, so let’s take Redis as an example.

In a relational database such as mysql, categories are stored hierarchically: a level-4 category is a child of a level-3 category, which in turn is a child of a level-2 category, which is a child of a level-1 category.

Because of this storage structure, it is not easy to fetch the whole classification tree in one query. It requires recursion in the program, which can be time-consuming when there are many categories.

Therefore, it is a very time-consuming operation to query the classification tree data directly from the database each time.

In this case, we can use the cache, and in most cases, the interface gets the data directly from the cache. Redis can be operated using mature frameworks such as Jedis and Redisson.

Jedis pseudocode is as follows:

String json = jedis.get(key);
if(StringUtils.isNotEmpty(json)) {
   CategoryTree categoryTree = JsonUtil.toObject(json);
   return categoryTree;
}
return queryCategoryTreeFromDb();

First, check Redis for the menu data by key. If present, convert it to an object and return it directly. If it is not found in Redis, query the menu data from the database and return it.

In addition, we need a job that queries the menu data from the database at regular intervals and updates it in Redis. This way, every request gets the menu data directly from Redis without touching the database. After this change, performance improves quickly.

But this is not the best way to improve performance. There are other alternatives, and let’s look at the following.

9.2 Level-2 Cache

The above solution is based on the Redis cache, and Redis access is fast. But it is still a remote call, and the menu tree contains a lot of data, so network transmission takes time.

Is there a way to get the data directly, without a remote request at all?

A: Use level 2 caching, which is memory-based caching.

In addition to hand-written memory caches, there are many in-memory caching frameworks in common use: Guava, Ehcache, Caffeine, and so on.

We’ll take Caffeine, which is officially recommended by Spring, as an example.

The first step is to introduce Caffeine's jar packages:

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-cache</artifactId>
</dependency>
<dependency>
    <groupId>com.github.ben-manes.caffeine</groupId>
    <artifactId>caffeine</artifactId>
    <version>2.6.0</version>
</dependency>

Second, configure a CacheManager and enable caching with @EnableCaching:

@Configuration
@EnableCaching
public class CacheConfig {
    @Bean
    public CacheManager cacheManager() {
        CaffeineCacheManager cacheManager = new CaffeineCacheManager();
        // Caffeine configuration
        Caffeine<Object, Object> caffeine = Caffeine.newBuilder()
                // Entries expire a fixed time after the last write
                .expireAfterWrite(10, TimeUnit.SECONDS)
                // Maximum number of entries in the cache
                .maximumSize(1000);
        cacheManager.setCaffeine(caffeine);
        return cacheManager;
    }
}

Third, use the @Cacheable annotation to retrieve the data:

@Service
public class CategoryService {

   @Cacheable(value = "category", key = "#categoryKey")
   public CategoryTree getCategory(String categoryKey) {
      String json = jedis.get(categoryKey);
      if (StringUtils.isNotEmpty(json)) {
         return JsonUtil.toObject(json);
      }
      return queryCategoryTreeFromDb();
   }
}

When the categoryService.getCategory() method is called, the data is first looked up in the Caffeine cache; if it is found, it is returned directly without entering the method body.

If it is not found there, Redis is checked next. If the data is found in Redis, it is returned and also put into Caffeine.

If it is still not found, the data is fetched from the database and placed into the Caffeine cache.

The specific flow chart is as follows. The performance of this solution is better, but the disadvantage is that if the data is updated, the cache cannot be refreshed in time. In addition, if there are multiple server nodes, each node may hold different data.

It can be seen that while second-level caching brings performance improvements, it also brings data inconsistency problems. The use of level 2 cache must be based on actual service scenarios. It is not suitable for all service scenarios.

But the classification scenario listed above is well suited to a level-2 cache. Because this data is not sensitive to users, a little inconsistency doesn't matter; users may not even notice.

10. Separate databases and tables

Sometimes, interface performance is limited by nothing but the database.

When the system develops to a certain stage, a large number of concurrent users will make a large number of database requests, occupying a large number of database connections, and causing disk I/O performance bottlenecks.

In addition, as users generate more and more data, a single table can no longer hold it all. With so much data, queries become slow even when the SQL statement uses an index.

What should I do?

A: you need to split the databases and tables.

As shown below: the figure splits the user database into three databases, each containing four user tables.

When a user request arrives, it is routed to one of the user databases according to the user ID, and then to a specific table.

There are many routing algorithms:

  • Modulo by ID: for example, if id=7 and there are 4 tables, 7 % 4 = 3, so the request is routed to user table 3.
  • ID ranges: for example, IDs 0–100,000 are stored in user table 0, and IDs 100,000–200,000 in user table 1.
  • Consistent hashing
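The first two routing rules can be sketched in a few lines (the class and method names are made up for illustration):

```java
public class ShardRouter {

    // Modulo routing: id=7 with 4 tables -> 7 % 4 = 3 -> user table 3
    public static int tableByModulo(long id, int tableCount) {
        return (int) (id % tableCount);
    }

    // Range routing: each table owns a contiguous block of ids,
    // e.g. with rangeSize=100,000: ids 0-99,999 -> table 0,
    // ids 100,000-199,999 -> table 1, and so on
    public static int tableByRange(long id, long rangeSize) {
        return (int) (id / rangeSize);
    }
}
```

Modulo routing spreads load evenly but makes resharding painful; range routing makes resharding easy but can concentrate hot new IDs in the newest table, which is part of why consistent hashing exists.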

There are two main directions: vertical and horizontal.

To be honest, the vertical direction (i.e. the business direction) is easier.

In the horizontal direction (that is, the data direction), splitting databases and splitting tables serve different purposes and should not be confused.

  • Splitting databases: solves the problems of insufficient database connection resources and disk I/O performance bottlenecks.
  • Splitting tables: solves the problem of a single table holding too much data, where queries take time even when the SQL statement uses an index; it also relieves CPU consumption.
  • Splitting both: solves all of the above — wait, rather: insufficient connection resources, disk I/O bottlenecks, slow queries, and CPU consumption.

In business scenarios with many concurrent users but little data to store, you can split only the databases without splitting tables.

In business scenarios with few concurrent users but a large amount of data to store, you can split only the tables without splitting databases.

In business scenarios with both many concurrent users and a large amount of data to store, you can split both databases and tables.

For more details, you can check out my other article, "Alibaba Second Interview: Why Split Databases and Tables?", which goes into more depth.

11. Auxiliary functions

In addition to the common methods mentioned above, some auxiliary tools are needed when optimizing interface performance, because they can really improve the efficiency of finding problems.

11.1 Enabling the Slow Query Log

Typically, to locate SQL performance bottlenecks, we need to enable MySQL's slow query log. SQL statements whose execution time exceeds a specified threshold are recorded separately for later analysis and problem location.

To enable the slow query log, pay attention to the following parameters:

  • slow_query_log: the switch for the slow query log
  • slow_query_log_file: the path where the slow query log is stored
  • long_query_time: the threshold, in seconds, beyond which a query is logged

You can enable it temporarily with the following SQL commands:

set global slow_query_log='ON'; 
set global slow_query_log_file='/usr/local/mysql/data/slow.log';
set global long_query_time=2;

If a SQL statement takes more than 2 seconds to execute, it is automatically recorded in slow.log.

You can also modify the configuration file my.cnf directly:

[mysqld]
slow_query_log = ON
slow_query_log_file = /usr/local/mysql/data/slow.log
long_query_time = 2

However, this method requires restarting the MySQL service.

Many companies send a slow query log email every morning, and developers optimize SQL based on this information.
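
MySQL ships real tools for this analysis (such as mysqldumpslow), but as a minimal illustration of what that morning email boils down to, here is a hypothetical sketch that pairs each `# Query_time:` header in a slow log with the SQL statement that follows it and ranks the slowest ones (the sample log text is invented, though it follows the slow log's conventional header format):

```python
import re

# Invented sample in the style of a MySQL slow query log.
SAMPLE_LOG = """\
# Query_time: 3.500000  Lock_time: 0.000100 Rows_sent: 1  Rows_examined: 500000
SELECT * FROM `order` WHERE user_id = 123;
# Query_time: 0.800000  Lock_time: 0.000050 Rows_sent: 10  Rows_examined: 10
SELECT id FROM user WHERE id = 7;
"""

def slowest_queries(log_text: str, top_n: int = 10):
    """Pair each '# Query_time:' header with the SQL line that follows it,
    then return the top_n slowest (query_time, sql) pairs."""
    entries = []
    current_time = None
    for line in log_text.splitlines():
        m = re.match(r"# Query_time: ([\d.]+)", line)
        if m:
            current_time = float(m.group(1))
        elif current_time is not None and line and not line.startswith("#"):
            entries.append((current_time, line.strip()))
            current_time = None
    return sorted(entries, reverse=True)[:top_n]

for query_time, sql in slowest_queries(SAMPLE_LOG):
    print(f"{query_time:>8.3f}s  {sql}")
```

In practice you would point mysqldumpslow or pt-query-digest at the real log file instead of hand-rolling a parser; the sketch only shows the shape of the analysis.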

11.2 Adding Monitoring

In order for SQL problems to be detected in time, we need to monitor the system.

The most widely used open source monitoring system in the industry is Prometheus.

It provides monitoring and warning functions.

The architecture diagram is as follows:

We can use it to monitor the following information:

  • Interface response time
  • Time spent calling third-party services
  • Slow SQL execution time
  • CPU usage
  • Memory usage
  • Disk usage
  • Database usage

Wait…

Its dashboard looks something like this: you can see MySQL's current QPS, the number of active threads, the number of connections, and the size of the buffer pool.

If too many connections in the connection pool are used up, interface performance will be affected.

In that case, connections may have been opened in the code and never closed, or the concurrency may simply be too high; either way, further investigation and system optimization are needed.
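
The "opened and never closed" failure mode above can be sketched with stand-in classes (this is illustrative pseudo-infrastructure, not a real database driver): a try/finally (or a context manager) guarantees the connection goes back to the pool even when the query throws.

```python
# Stand-in pool to illustrate connection leaks; not a real driver.

class Pool:
    def __init__(self, size: int):
        self.free = size

    def acquire(self):
        if self.free == 0:
            raise RuntimeError("pool exhausted")  # what the dashboard shows
        self.free -= 1
        return self

    def release(self):
        self.free += 1

pool = Pool(size=2)

def handle_request(pool: Pool) -> None:
    conn = pool.acquire()
    try:
        pass  # run queries here
    finally:
        conn.release()  # always returned to the pool, even if the query throws

for _ in range(10):       # 10 requests against a pool of 2 connections:
    handle_request(pool)  # fine, because every request releases its connection

print(pool.free)  # 2 -- all connections are back in the pool
```

Without the finally (or a with-statement wrapper), each failed request would strand one connection, and the pool would be exhausted after a handful of requests, which matches the symptom described above.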

The screenshots show only a few of its features. If you want to learn more, visit Prometheus's website: prometheus.io/

11.3 Link Tracing

Sometimes there is a lot of logic involved in an interface, such as database lookup, redis lookup, remote invocation interface, sending MQ messages, executing business code, and so on.

The call chain of such an interface is very long, and checking each step one by one takes a lot of time, so traditional methods can no longer locate the problem efficiently.

Is there a way to solve this problem?

The answer: a distributed link tracing system, such as SkyWalking.

The architecture diagram is as follows:

In SkyWalking, a traceId (a globally unique ID) strings together the complete call chain of an interface request. You can see the total time of the interface, the time spent on each remote service call, the time spent accessing the database or redis, and so on. It is very powerful.

Before this capability was available, to locate interface performance problems online we had to add logs to the code, manually print the time consumed by each step, and then check them one by one.
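
That manual approach looks roughly like this hypothetical sketch: generate one ID per request, tag every timing log line with it, and grep the logs by that ID afterwards (the step names and helper are invented for illustration):

```python
import time
import uuid

def timed_step(trace_id: str, name: str, fn):
    """Run one step of the request and log its cost under the shared trace id."""
    start = time.perf_counter()
    result = fn()
    cost_ms = (time.perf_counter() - start) * 1000
    print(f"[trace={trace_id}] step={name} cost={cost_ms:.1f}ms")
    return result

def handle_request():
    trace_id = uuid.uuid4().hex  # one id per request, shared by every log line
    user = timed_step(trace_id, "query_db", lambda: {"id": 7})
    timed_step(trace_id, "query_redis", lambda: None)
    timed_step(trace_id, "call_remote_api", lambda: "ok")
    return user

handle_request()
```

This is exactly the bookkeeping that SkyWalking's traceId automates and propagates across service boundaries, which is why a tracing system beats hand-written timing logs.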

If you’ve ever used SkyWalking to troubleshoot interface performance problems, you’ll love it. For more information, visit Skywalking’s website at skywalking.apache.org/

One last word (please don't just read and leave)

If this article has helped or inspired you, please scan the QR code and follow the account. Your support is the biggest motivation for me to keep writing.

Asking for a one-click triple: like, share, and view.

Follow the public account [Su San said technology]. Reply "interview", "code artifact", "development manual", or "time management" for great fan benefits; reply "add group" to communicate with and learn from many senior engineers at BAT and other big companies.