Background of the Node.js project

Since 2016, Suning has used Node.js-based rendering projects on a large scale. The architecture combines Nginx, Node.js, and PM2. The Node.js version has been upgraded from 6.0+ to 8.0+, and the framework has moved from Express to Koa2. With Node.js performance optimization as the core concern, Suning began exploring performance improvements from scratch.

Preliminary optimization – CSS, JS registration and merge

EJS template optimization

Suning's Node.js projects used the Express framework at first and moved to the Koa framework after the release of the Node.js 8.0 LTS version. Under both Express and Koa, Suning uses the EJS template language for project development (EJS itself is not introduced here; interested readers can look it up).

The performance penalty of merging CSS and JS

The page template is pulled into layout.ejs with the EJS include method, for example:

    //layout.ejs
    <link type="text/css" rel="stylesheet" href="public.css" />
    <script src="public.js"></script>
    ...
    include(page1);
    ...

    //page1.ejs
    <link type="text/css" rel="stylesheet" href="page1.css" />
    <script src="page1.js"></script>
    <h1>hello</h1>

This separates the common parts from each page's business logic, but it introduces another problem: where the static resource tags from the layout and page1 templates end up. Here is the HTML page returned to the client after rendering:

    ...
    <link type="text/css" rel="stylesheet" href="public.css" />
    <script src="public.js"></script>
    </header>
    <body>
    <div class="header"></div>
    <link type="text/css" rel="stylesheet" href="page1.css" />
    <script src="page1.js"></script>
    <h1>hello</h1>
    </body>
    ...

The same happens for page2, page3, and so on up to pageN. The body ends up containing a large number of static resource tags, which is not what we want; we need to control where static resource tags are emitted in the page. To solve this, Suning introduced a static resource registration mechanism for EJS templates. The registration steps are as follows:

A. Output placeholders using the getResource() method.

B. Use register() to register resources, for example, register('a.css', 'b.js').

C. Merge the registered static resources and perform the string replace operation.

The HTML page rendered by the EJS template after using the register method is as follows:

    ...
    {{{CSS_PLACEHOLDER}}}
    </header>
    <body>
    <div class="header"></div>
    <h1>hello</h1>
    </body>
    {{{JS_PLACEHOLDER}}}
    ...

"{{{CSS_PLACEHOLDER}}}" and "{{{JS_PLACEHOLDER}}}" are the placeholders output by getResource(). Before the server responds, a string replace swaps each placeholder for the paths registered through the register() method:

    ...
    <link type="text/css" rel="stylesheet" href="public.css" />
    <link type="text/css" rel="stylesheet" href="page1.css" />
    </header>
    <body>
    <div class="header"></div>
    <h1>hello</h1>
    </body>
    <script src="public.js"></script>
    <script src="page1.js"></script>
    ...
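The register-and-replace flow can be sketched roughly as follows. This is a minimal illustration, not Suning's actual implementation: createResourceRegistry and apply are hypothetical names, and the only details taken from the article are register(), getResource(), and the two placeholder strings.

```javascript
// Hypothetical per-request registry; only register(), getResource() and the
// placeholder strings come from the article, everything else is assumed.
const CSS_PLACEHOLDER = '{{{CSS_PLACEHOLDER}}}';
const JS_PLACEHOLDER = '{{{JS_PLACEHOLDER}}}';

function createResourceRegistry() {
  const css = [];
  const js = [];
  return {
    // Called from templates: collect paths in order, skipping duplicates.
    register(...paths) {
      for (const p of paths) {
        const list = p.endsWith('.css') ? css : js;
        if (!list.includes(p)) list.push(p);
      }
    },
    // Called from the layout: emit a placeholder that is replaced later.
    getResource(type) {
      return type === 'css' ? CSS_PLACEHOLDER : JS_PLACEHOLDER;
    },
    // Called once on the rendered HTML, just before the response is sent.
    apply(html) {
      const cssTags = css
        .map((p) => `<link type="text/css" rel="stylesheet" href="${p}" />`)
        .join('\n');
      const jsTags = js.map((p) => `<script src="${p}"></script>`).join('\n');
      // Function replacers avoid special handling of "$" in the replacement.
      return html
        .replace(CSS_PLACEHOLDER, () => cssTags)
        .replace(JS_PLACEHOLDER, () => jsTags);
    },
  };
}

const registry = createResourceRegistry();
registry.register('public.css', 'public.js'); // as if called in layout.ejs
registry.register('page1.css', 'page1.js');   // as if called in page1.ejs
const rendered =
  `${registry.getResource('css')}\n<h1>hello</h1>\n${registry.getResource('js')}`;
const html = registry.apply(rendered);
console.log(html);
```

Because the templates only record paths and the layout only emits placeholders, the final position of every tag is decided in one place, at the cost of one extra pass over the rendered string.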

This puts the static resources in their usual positions in the page. Suning's register() method also merges paths. The merged addresses look like this:

    ...
    <link type="text/css" rel="stylesheet" href="public.css,page1.css" />
    </header>
    <body>
    <div class="header"></div>
    <h1>hello</h1>
    </body>
    <script src="public.js,page1.js"></script>
    ...

This reduces the number of requests the browser sends, and reducing page requests is itself a performance optimization.
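A minimal sketch of the merge step, assuming the static file server can answer a single request for a comma-separated list of files (the article implies this but does not describe the server side). The helper name mergedTag is hypothetical.

```javascript
// Hypothetical helper: join registered paths into one comma-separated URL.
// Assumes the static server supports such combined requests.
function mergedTag(paths, type) {
  const unique = [...new Set(paths)]; // keep first-seen order, drop repeats
  if (unique.length === 0) return '';
  const url = unique.join(',');
  return type === 'css'
    ? `<link type="text/css" rel="stylesheet" href="${url}" />`
    : `<script src="${url}"></script>`;
}

const cssTag = mergedTag(['public.css', 'page1.css'], 'css');
const jsTag = mergedTag(['public.js', 'page1.js', 'public.js'], 'js');
console.log(cssTag);
console.log(jsTag);
```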

Caching mechanisms

After introducing the register mechanism, we found another problem. For every client request, the Node.js service searches and replaces strings before responding. If the page is complex, the final rendered string is large, and every search-and-replace pass carries a performance cost. However, the static resources a page references do not change when the same route is visited repeatedly. Suning took advantage of this and introduced a static resource caching mechanism.

When a page request comes in, a cache lookup keyed on the pathname of the request URL is performed before the register method runs. On a cache hit, getResource() returns the cached content and the corresponding register() calls are skipped. Otherwise, the register() process runs as usual. With caching in place, every visit after the first skips the registration and replacement steps, so the page response time drops. Across several tests, the page response time was shortened by approximately 4-8 ms.
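The lookup described above might look like the sketch below. The cache shape, the getResourceTags name, and the buildTags callback are assumptions made for illustration; the article only states that the cache is keyed on the request pathname.

```javascript
// Hypothetical cache: request pathname -> final resource tags.
const resourceCache = new Map();

// buildTags stands in for the register-and-merge work; it runs only on a miss.
function getResourceTags(requestUrl, buildTags) {
  // The base URL is a placeholder so that relative request URLs can be parsed.
  const pathname = new URL(requestUrl, 'http://localhost').pathname;
  if (resourceCache.has(pathname)) return resourceCache.get(pathname);
  const tags = buildTags();
  resourceCache.set(pathname, tags);
  return tags;
}

let builds = 0;
const build = () => {
  builds += 1;
  return '<script src="public.js,page1.js"></script>';
};
const first = getResourceTags('/page1?id=1', build);
const second = getResourceTags('/page1?id=2', build); // same pathname: cache hit
console.log(builds, first === second);
```

Keying on the pathname alone means query strings share one entry, which is only safe while the resources a page registers do not depend on query parameters.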

Advanced optimization – Optimal matching of a large number of routes

While developing the Suning Eshopping Hong Kong site, the project ended up with as many as 173 static routes and 11 dynamic routes, owing to the large number of pages across the site, the large number of developers involved, and project security considerations. As a result, the efficiency of route matching dropped significantly. The cause can be traced to the Express source code: Express converts each configured route into a regular expression and, when a request comes in, tests the routes one by one until one matches.

Dynamic routes, which contain fuzzy matching, must still be matched with regular expressions. Static routes, that is, route expressions that are fixed strings, can be matched by key-value lookup. The complexity drops from O(n) to O(1), which greatly reduces matching time and keeps it from growing as routes are added. In the actual code, because the architecture uses a centralized routing configuration, it is easy to filter the static routes out of the configuration file and store them in an Object (used as a hash map). The lookup is then wrapped as a single middleware, which is equivalent to collapsing many routing middleware into one.
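The split between static and dynamic routes can be sketched as below. The route table, the handlers, and the dispatch function are hypothetical, and the sketch only understands ":name" parameters; it is not Express's matcher, and a Map is used here in place of the plain Object the article mentions.

```javascript
// Hypothetical centralized route config; paths and handlers are illustrative.
const routes = [
  { path: '/', handler: () => 'home' },
  { path: '/cart', handler: () => 'cart' },
  { path: '/product/:id', handler: (params) => `product ${params.id}` },
];

const staticRoutes = new Map(); // fixed path -> handler, O(1) lookup
const dynamicRoutes = [];       // regex routes, still scanned in order

for (const route of routes) {
  if (route.path.includes(':')) {
    const keys = [];
    const source = route.path.replace(/:(\w+)/g, (_, key) => {
      keys.push(key);
      return '([^/]+)';
    });
    dynamicRoutes.push({
      regex: new RegExp(`^${source}$`),
      keys,
      handler: route.handler,
    });
  } else {
    staticRoutes.set(route.path, route.handler);
  }
}

function dispatch(pathname) {
  // Static routes are tried first with a single key lookup.
  const hit = staticRoutes.get(pathname);
  if (hit) return hit({});
  // Only the dynamic routes still need regular expression matching.
  for (const { regex, keys, handler } of dynamicRoutes) {
    const match = regex.exec(pathname);
    if (match) {
      const params = {};
      keys.forEach((key, i) => { params[key] = match[i + 1]; });
      return handler(params);
    }
  }
  return null;
}

console.log(dispatch('/cart'), dispatch('/product/42'), dispatch('/missing'));
```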

Disadvantage: compared with the original logic, the optimized scheme no longer preserves the order of route matching, so extra care is needed during development. The overall impact is small, however, because static routes are matched first and should be the first to respond anyway.

Higher order optimization – TPS enhancement

When Suning's Dajuhui system separated its front end and back end, the stress test results submitted for the first time were very poor, and we suspected that something was not configured properly. The data at that time was as follows (16 machines, 4C4G each):

The TPS was uncomfortably low. With Node.js 8.9.1 in use at the time, it should not have been that bad in theory, and reading the code revealed no significant performance drain. We finally found the cause: the EJS template configuration did not enable template caching. Without template caching, the local template file is read from disk on every render request, and this disk read consumes a lot of CPU. It goes unnoticed in normal use and only shows up under stress testing. After setting the parameter, we got a 10-fold performance improvement.
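To illustrate why the missing cache hurts, the sketch below counts disk reads with and without a cache of compiled templates. The compile function is a toy stand-in written for this example, not the EJS compiler; in EJS itself the relevant setting is the cache option, which requires the filename option to be set.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

let diskReads = 0;
const compiledCache = new Map(); // file path -> compiled template function

// Toy stand-in for a template compiler: replaces <%= name %> with data[name].
function compile(source) {
  return (data) =>
    source.replace(/<%=\s*(\w+)\s*%>/g, (_, key) => String(data[key]));
}

function render(file, data, { cache }) {
  let fn = cache ? compiledCache.get(file) : undefined;
  if (!fn) {
    diskReads += 1; // every miss costs a synchronous disk read plus a compile
    fn = compile(fs.readFileSync(file, 'utf8'));
    if (cache) compiledCache.set(file, fn);
  }
  return fn(data);
}

// Write a throwaway template so the example is self-contained.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'tpl-'));
const file = path.join(dir, 'hello.ejs');
fs.writeFileSync(file, '<h1><%= title %></h1>');

for (let i = 0; i < 3; i += 1) render(file, { title: 'hello' }, { cache: false });
const readsWithoutCache = diskReads;

diskReads = 0;
for (let i = 0; i < 3; i += 1) render(file, { title: 'hello' }, { cache: true });
const readsWithCache = diskReads;

console.log(readsWithoutCache, readsWithCache);
```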

But our optimization did not stop there. We set a target of 3000 TPS, which meant improving rendering performance by another 50%. At this point we had to find what was limiting Node.js performance. Node.js is single-threaded and asynchronous, so asynchronous operations have little impact on performance, while synchronous operations have a significant impact.

So the first step was to check the synchronous logic in the code for anything CPU-intensive. After inspection, the application code was ruled out. We then turned to the DevTools provided by Chrome: start Node with the --inspect flag, open Chrome DevTools, and analyze a CPU profile. Aside from unavoidable CPU consumption, it turned out that some of the CPU time was being spent inside the EJS template engine.

As the figure shows, the consumption comes from two places: shallow copies inside the EJS template engine, and system calls that check whether a file exists. The Dajuhui system makes heavy use of include in EJS, which makes this cost stand out. If you open the EJS engine source code, you will find that even though the template is cached, the include function still calls fs.existsSync every time. Once the culprit was found, the fix was simple: before calling that function, check whether the template is already in the cache. After the change, this part of the consumption dropped considerably.
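The idea of the fix can be sketched as follows. This is not the EJS source; loadTemplate and templateCache are hypothetical names, and compilation is replaced by a trivial stand-in. The point is only the ordering: consult the cache first, and touch the file system only on a miss.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const templateCache = new Map(); // absolute path -> compiled function (assumed shape)
let existsCalls = 0;

function loadTemplate(absPath) {
  // A cache hit means the file was already found and compiled once,
  // so the synchronous existence check can be skipped entirely.
  const cached = templateCache.get(absPath);
  if (cached) return cached;

  existsCalls += 1;
  if (!fs.existsSync(absPath)) {
    throw new Error(`Could not find include file: ${absPath}`);
  }
  const source = fs.readFileSync(absPath, 'utf8');
  const fn = () => source; // stand-in for real template compilation
  templateCache.set(absPath, fn);
  return fn;
}

// Write a throwaway partial so the example is self-contained.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'inc-'));
const partial = path.join(dir, 'header.ejs');
fs.writeFileSync(partial, '<div class="header"></div>');

for (let i = 0; i < 1000; i += 1) loadTemplate(partial);
console.log(existsCalls);
```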

The shallow copy problem is solved with the JavaScript prototype chain. The incoming data object is used as the prototype, and a derived object is constructed with Object.create. This achieves what the original shallow copy was for: modifying object properties inside a template does not affect the original object, so changes cannot leak into other templates. When a property is assigned on the derived object, the property on the prototype is not modified; instead, a new property with the same name is created on the derived object, so the original object is not polluted. Newly added properties likewise exist only on the derived object. This optimization reduces the number of assignments.
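A small example of the difference, using made-up data. Object.assign stands in for the engine's shallow copy; the engine's actual copy routine may differ.

```javascript
const original = { title: 'hello', items: [1, 2, 3] };

// Shallow copy: one assignment per own property, on every render.
const copied = Object.assign({}, original);

// Prototype-based alternative: nothing is copied up front.
const derived = Object.create(original);

derived.title = 'changed'; // creates an own property on derived only
derived.extra = true;      // new properties also live only on derived

console.log(original.title);       // hello
console.log(derived.title);        // changed
console.log(derived.items.length); // 3, read through the prototype chain
console.log(Object.keys(derived)); // [ 'title', 'extra' ]
```

Two caveats apply to both approaches equally or nearly so: nested objects such as items are still shared with the original, and code that enumerates own properties (for example Object.keys) will not see values inherited from the prototype.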

After the optimizations above, further CPU profile analysis showed that one more function in the EJS engine was still consuming CPU: getIncludePath. This function converts the relative path passed to include into an absolute path. The conversion prevents ambiguity when the same relative path string, passed from different nested include files, refers to different templates. However, the conversion to an absolute path calls file system functions, which is what consumes CPU.

The solution came quickly: build a map from relative paths to absolute paths and cache it, so the absolute path does not have to be computed every time. This cache cannot be global, of course; a separate cache must be created for each template that performs includes, so that the same relative path is never ambiguous.

Original logic:

Optimized logic:

The path Map is defined in the scope of the template function, so only that template function can access it, and each template function has its own separate Map.
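A minimal sketch of a per-template path cache held in a closure. createInclude and resolveIncludePath are hypothetical names, and path.resolve stands in for the engine's getIncludePath, which according to the article also touches the file system.

```javascript
const path = require('path');

let resolveCalls = 0;

// Stand-in for the engine's relative-to-absolute conversion (assumed logic).
function resolveIncludePath(relativePath, parentFile) {
  resolveCalls += 1;
  return path.resolve(path.dirname(parentFile), relativePath);
}

// Each template function gets its own Map through the closure, so the same
// relative string in two different templates can never collide.
function createInclude(parentFile) {
  const pathMap = new Map(); // relative path -> absolute path, private to this template
  return function include(relativePath) {
    let absolute = pathMap.get(relativePath);
    if (absolute === undefined) {
      absolute = resolveIncludePath(relativePath, parentFile);
      pathMap.set(relativePath, absolute);
    }
    return absolute;
  };
}

const includeFromA = createInclude('/views/a/index.ejs');
const includeFromB = createInclude('/views/b/index.ejs');
const a1 = includeFromA('./header');
const a2 = includeFromA('./header'); // served from template A's own Map
const b1 = includeFromB('./header'); // same string, different template
console.log(resolveCalls, a1 === a2, a1 === b1);
```

A single global Map keyed only on the relative string would return template A's header for template B, which is exactly the ambiguity the per-template scope avoids.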

After the optimizations above, local stress tests showed a 50% performance improvement, so we submitted the build to the test team for an online stress test of the Dajuhui system.

The stress test results were excellent: from 2000 TPS to over 3500 TPS, an improvement of 75%. A single machine handles about 220 TPS, while the original Java system handled about 150 TPS.

Summary

The performance advantage of a Node.js system lies mainly in asynchronous I/O, so performance bottlenecks are mainly caused by synchronous operations. Optimization is therefore mostly about minimizing synchronous operations and making appropriate use of a few JavaScript techniques. In addition, the open-source nature of npm packages makes this kind of optimization convenient.

About the author

Li Hao is a senior front-end technology manager at Suning Tesco, mainly responsible for developing Suning's Node.js projects for front-end and back-end separation. He has many years of web front-end experience and was previously head of front end for Tuniu Golden Service. He loves front-end work and is passionate about learning new technologies, and he has unique insights and rich project experience in Node.js front-end and back-end separation, Koa, and other frameworks.

Thanks to Qin Yun for reviewing this article.