The year was 2012. PHP and Ruby on Rails reigned as the top server-side technologies for rendering web applications. But a bold new contender was taking the scene by storm, one capable of handling a million concurrent connections. That technology was Node.js, and it has grown steadily in popularity ever since.

Unlike most competing technologies of the time, Node.js came with a web server built in. Having this server means developers can bypass a myriad of configuration files, such as the layered collections of .htaccess and php.ini files. A built-in web server provides other benefits as well, such as the ability to process files as they are being uploaded and the ease of implementing WebSockets.

Node.js-powered Web applications happily handle billions of requests every day. Most of the largest companies in the world are powered in some way by Node.js. To say Node.js is production-ready is of course an understatement. However, one piece of advice has been true since Node.js was born: Node.js processes should not be exposed to the Web directly, but should be hidden behind a reverse proxy. But before we figure out why we should use a reverse proxy, let’s first look at what it is.

What is a reverse proxy?

A reverse proxy is basically a special type of Web server that takes requests, forwards them to another HTTP server somewhere else, receives the reply, and forwards the reply to the original requester.

However, a reverse proxy does not usually send along the exact request it receives. Typically, it modifies the request in some way. For example, if the reverse proxy lives at www.example.org:80 and forwards requests to ex.example.org:8080, it will probably overwrite the Host header to match the target. It may also modify requests in other ways, such as cleaning up malformed requests or translating between protocols.
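As a concrete sketch, an Nginx configuration for this scenario might look like the following (the hostnames are the illustrative ones from above):

```nginx
server {
  listen 80;
  server_name www.example.org;

  location / {
    # Forward the request to the internal server...
    proxy_pass http://ex.example.org:8080;
    # ...and overwrite the Host header to match the target.
    proxy_set_header Host ex.example.org;
  }
}
```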

Once the reverse proxy receives a response, it may transform that response in some way, too. Again, a common change is modifying the Host header to match the original request. The body of the response can also be changed. A common modification is to perform gzip compression on the response; another is to provide HTTPS support when the underlying service speaks only HTTP.

A reverse proxy can also dispatch incoming requests to multiple back-end instances. If a service is exposed at api.example.org, a reverse proxy can forward requests to api1.internal.example.org, api2.internal.example.org, and so on.
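With Nginx, that kind of fan-out is declared with an upstream block; a minimal sketch, assuming the example hostnames above:

```nginx
upstream api_backends {
  # Requests are distributed round-robin by default.
  server api1.internal.example.org:8080;
  server api2.internal.example.org:8080;
}

server {
  listen 80;
  server_name api.example.org;

  location / {
    proxy_pass http://api_backends;
  }
}
```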

There are many different reverse proxies out there. Two of the most popular are Nginx and HAProxy. Both tools can perform gzip compression and add HTTPS support, and each specializes in other areas as well. Nginx is the more popular of the two and also has some other useful capabilities, such as serving static files from the filesystem, so we'll use it as our example throughout this article.

Now that we know what a reverse proxy is, let's look at why we would want to use one with Node.js.
Why should I use a reverse proxy?

SSL termination

SSL termination is one of the most popular reasons to use a reverse proxy. Changing the protocol an application speaks from http to https takes a little more work than appending an s. Node.js itself is perfectly able to perform the necessary encryption and decryption for https and can be configured to read the required certificate files.

However, the protocol used to communicate with our application, and the management of ever-expiring SSL certificates, are not really concerns our application needs to deal with. Checking certificates into a codebase is not only tedious but also a security risk, and acquiring certificates from a central location at application startup has its own risks.

For these reasons, it is better to perform SSL termination outside the application, usually within a reverse proxy. Thanks to tools like certbot from Let's Encrypt, maintaining certificates with Nginx is as easy as setting up a cron job. Such a job can automatically install new certificates and dynamically reconfigure the Nginx process. This is a far less disruptive process than restarting each Node.js application instance.
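As a rough sketch, the relevant part of an Nginx configuration after certbot has installed a certificate might look like this (the paths follow certbot's defaults, and example.org and the back-end port are placeholders):

```nginx
server {
  listen 443 ssl;
  server_name example.org;

  # Certificate files maintained by certbot's renewal cron job.
  ssl_certificate     /etc/letsencrypt/live/example.org/fullchain.pem;
  ssl_certificate_key /etc/letsencrypt/live/example.org/privkey.pem;

  location / {
    # The Node.js application behind the proxy speaks plain HTTP.
    proxy_pass http://127.0.0.1:3000;
  }
}
```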

Also, by allowing a reverse proxy to perform SSL termination, only code written by the reverse proxy's authors ever has access to the private SSL certificate. If the application performs SSL termination itself, however, every third-party module used by the application, including potentially malicious modules, has access to that private certificate.

Gzip compression

Gzip compression is another feature that should be offloaded from the application to a reverse proxy. Gzip compression policies are best set at the organizational level, rather than specified and configured for each application individually.

It's good to apply some logic when deciding what to gzip. For example, very small files, perhaps under 1KB, may not be worth compressing: the gzipped version can sometimes be larger, and the CPU overhead of having the client decompress the file may not be worth it. Binary data, depending on its format, may also not benefit from compression. Gzip also isn't something that can simply be enabled or disabled; it requires inspecting the incoming Accept-Encoding header for compatible compression algorithms.

Clustering

JavaScript is a single-threaded language, and accordingly Node.js has traditionally been a single-threaded server platform (though the experimental worker thread support available as of Node.js v10 aims to change this). This means that getting as much throughput as possible from a Node.js application requires running roughly the same number of instances as there are CPU cores.

Node.js is able to do this itself via the built-in cluster module. Incoming HTTP requests are made to a master process and then dispatched to cluster workers.

However, dynamically scaling cluster workers takes some effort. There is also usually overhead involved in running an additional Node.js process as the dispatching master process. Furthermore, scaling across different machines is something the cluster module simply cannot do.

For these reasons, it is sometimes better to use a reverse proxy to dispatch requests to running Node.js processes. Such reverse proxies can be dynamically reconfigured to point at new application processes as they come online. Really, an application should just focus on doing its own work; it should not be concerned with managing multiple copies of itself and dispatching requests.

Enterprise routing

When working with large web applications, such as those built by enterprises with multiple teams, it is very useful to have a reverse proxy decide where to route requests. For example, requests to example.org/search/* can be routed to an internal search application, while requests to example.org/profile/* can be dispatched to an internal profile application.
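In Nginx, this kind of path-based routing comes down to location blocks; a sketch, using hypothetical hostnames for the internal applications:

```nginx
server {
  listen 80;
  server_name example.org;

  # Requests under /search/ go to the internal search application.
  location /search/ {
    proxy_pass http://search.internal.example.org;
  }

  # Requests under /profile/ go to the internal profile application.
  location /profile/ {
    proxy_pass http://profile.internal.example.org;
  }
}
```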

Such tooling allows for other powerful features, such as sticky sessions, blue/green deployments, A/B testing, and more. I have personally worked on a codebase where this logic was performed inside the application, and the approach made the application quite difficult to maintain.

Performance advantage

Node.js is highly malleable. It is able to serve static assets from the filesystem, perform gzip compression on HTTP responses, comes with built-in support for HTTPS, and offers many other features. It even has the ability to run multiple instances of an application and perform its own request dispatching, by way of the cluster module.

Ultimately, however, it is in our best interest to let a reverse proxy handle these operations for us instead of having our Node.js application do them. Beyond each of the reasons listed above, another reason for doing this work outside of Node.js is efficiency.

SSL encryption and gzip compression are two highly CPU-bound operations. Dedicated reverse proxy tools like Nginx and HAProxy typically perform these operations faster than Node.js does. A web server like Nginx can also read static content from disk faster than Node.js can. Even clustering can sometimes be more efficient, since a reverse proxy like Nginx uses less memory and CPU than an additional Node.js master process would.

But don't take our word for it. Let's run some benchmarks!

We use siege to perform the load tests. We run it with a concurrency value of 10 (10 simulated users making requests at the same time) for 20,000 iterations per user (200,000 total requests), using a command along the lines of siege -b -c 10 -r 20000 <url>.

To check memory usage, we run pmap <pid> | grep total a few times throughout the lifetime of each benchmark and then average the results. When running Nginx with a single worker thread, two Nginx processes end up running, one master and one worker, so we sum the two values. When running a Node.js cluster of 2, there are three processes: one master and two workers. The approximate memory column in the table below is the sum of every Nginx and Node.js process for the given test.

Here are the results of the benchmark:

Benchmark results

In the node-cluster benchmark we used two workers, meaning three Node.js processes were running: one master and two workers. In the nginx-cluster-node benchmark we ran two Node.js processes. Each Nginx test involves one Nginx master process and one Nginx worker process. The benchmarks involve reading a file from disk, and neither Nginx nor Node.js was configured to cache the file in memory.

Using Nginx to perform SSL termination for Node.js results in a throughput increase of about 16% (749 rps to 865 rps). Performing gzip compression with Nginx results in a throughput increase of about 50% (5,047 rps to 7,590 rps). Using Nginx to manage a cluster of processes resulted in a performance loss of about 1% (8,006 rps to 7,908 rps), probably due to the overhead of passing additional requests over the loopback network device.

Roughly speaking, a single Node.js process uses about 600MB of memory, while an Nginx process uses about 50MB. These numbers fluctuate slightly depending on the features in use; for example, Node.js uses an additional ~13MB when performing SSL termination, and Nginx uses an additional ~4MB when acting as a reverse proxy serving static content from the filesystem. An interesting thing to note is that Nginx uses a consistent amount of memory throughout its lifetime, whereas Node.js constantly fluctuates due to the garbage-collecting nature of JavaScript.

Here are the software versions used to perform these benchmarks:

  • Nginx: 1.14.2
  • Node.js: 10.15.3
  • Siege: 3.0.8

The tests were performed on a machine with 16GB of RAM and an i7-7500U CPU (4×2.70GHz), running Linux kernel 4.19.10. All the files necessary to recreate the benchmarks above are available here: IntrinsicLabs/nodejs-reverse-proxy-benchmarks.

Simplified application code

Benchmarks are nice, but in my opinion the biggest benefit of offloading work from a Node.js application to a reverse proxy is code simplicity. We reduce the number of lines of potentially buggy imperative application code and exchange them for declarative configuration. A common sentiment among developers is that they're more confident in code written by an external team of engineers, such as the Nginx team, than in code written by themselves. Instead of installing and managing gzip compression middleware and keeping it up to date across various Node.js projects, we can configure it in one place. Instead of shipping or downloading SSL certificates and either re-acquiring them or restarting application processes, we can use existing certificate management tools. Instead of adding conditionals to our application to check whether a process is a master or a worker, we can offload that to another tool. A reverse proxy lets our application focus on business logic and forget about protocols and process management.

Although Node.js is perfectly capable of running in production, using a reverse proxy in front of a production Node.js HTTP application offers numerous benefits. Operations like SSL termination and gzip compression become faster. Managing SSL certificates becomes simpler. The amount of application code required is also reduced. I strongly recommend using a reverse proxy with your next production Node.js application.
