You have probably heard of Nginx. If not, you have surely heard of its peer, Apache.

The birth of Nginx

Nginx is a web server, like Apache. Following the REST architectural style, it addresses resources via Uniform Resource Identifiers (URIs) or Uniform Resource Locators (URLs) and provides a variety of network services over the HTTP protocol.

However, each of these servers was shaped by the constraints of its era: user scale, network bandwidth, product characteristics, and so on. Their positioning and evolution differ, and that is what makes each web server unique.

Apache has a long history and is the undisputed number one server in the world. It has many advantages: stable, open source, cross-platform, etc.

It has been around a long time, though, and it appeared when the Internet industry was nothing like it is today, so it was designed as a heavyweight.

It does not handle high concurrency well: running tens of thousands of concurrent connections on Apache makes the server consume a great deal of memory.

Having the operating system switch between that many processes or threads also burns a lot of CPU, which drags down the average response speed of HTTP requests.

All of this means Apache cannot be a high-performance web server, and the lightweight, high-concurrency Nginx came into being to fill that gap.

Igor Sysoev, a Russian engineer, developed Nginx in C while working for Rambler Media.

As Rambler Media's web server, Nginx delivered excellent and consistent service. Igor Sysoev then open-sourced the Nginx code and released it under a free software license.

Nginx is popular for several reasons:

  • Nginx uses an event-based architecture that allows it to support millions of TCP connections (a minimal configuration sketch follows this list).
  • A high degree of modularity and free software licenses allow third-party modules to proliferate (this is the age of open source).
  • Nginx is a cross-platform server that runs on Linux, Windows, FreeBSD, Solaris, AIX, Mac OS and other operating systems.
  • These excellent designs bring great stability.
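
To make the event-driven point concrete, here is a minimal, hedged sketch of the relevant nginx.conf settings; the numbers are illustrative, not tuning advice:

```nginx
# nginx.conf -- illustrative values only
worker_processes  auto;          # one worker process per CPU core

events {
    use epoll;                   # efficient event mechanism on Linux
    worker_connections  10240;   # max simultaneous connections per worker
}
```

With a handful of single-threaded workers each multiplexing many connections, Nginx avoids the per-connection process/thread cost described above.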

Where Nginx comes in

Nginx is a free, open source, high-performance HTTP server and reverse proxy server. It is also an IMAP, POP3, and SMTP proxy server.

Nginx can be used as an HTTP server for web site publishing and as a reverse proxy for load balancing.
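
As a quick illustration of the website-publishing role, here is a minimal, hedged server block for serving static files; the domain and path are hypothetical:

```nginx
server {
    listen       80;
    server_name  example.com;     # hypothetical domain

    location / {
        root   /var/www/example;  # hypothetical document root
        index  index.html;
    }
}
```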

About proxies

When it comes to proxying, the first concept to make clear is this: a proxy is a representative, a channel. Two roles are involved: the proxy role and the target role.

The process in which one party works through the proxy to reach the target role and get something done is called the proxy operation process. Think of a brand store in everyday life: when a customer goes to an Adidas store to buy a pair of shoes, the store is the proxy, the party being proxied is the Adidas manufacturer, and the target role is the customer.

Forward proxy

Before discussing the reverse proxy, let's look at the forward proxy, the proxy mode people encounter most often. We will explain how a forward proxy works from two angles: software and everyday life.

In today's network environment, suppose we need to visit some foreign websites for technical reasons, only to find that the browser cannot reach a site hosted abroad.

In that case we may resort to an operation known as FQ ("climbing over the wall"): we find a proxy server that can reach foreign websites, send our request to that proxy, the proxy visits the foreign site on our behalf, and the retrieved data is relayed back to us.

This proxy mode is called a forward proxy. Its biggest characteristic is that the client knows exactly which server it wants to reach, while the server only knows which proxy server the request came from, not which specific client. A forward proxy masks or hides the real client's information.

Here's a schematic (I drew the client and the forward proxy in one box because they sit in the same environment, which I'll describe later):

The client must be configured to use the forward proxy server, which of course requires knowing the forward proxy server's IP address and the port the proxy program listens on.

The diagram below:

To summarize: a forward proxy "proxies the client." It is a server that sits between the client and the origin server; to get content from the origin server, the client sends a request to the proxy and specifies the destination (the origin server).

The proxy then forwards the request to the origin server and returns the fetched content to the client. A client must make some special settings to use a forward proxy.

Forward proxy uses:

  • Access previously inaccessible resources, such as Google.
  • Cache content to speed up access to resources.
  • Apply access control to clients and authenticate their Internet access.
  • The proxy can record user access records (online behavior management) and hide user information externally.
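
Nginx is not primarily a forward proxy, and plain Nginx cannot tunnel HTTPS traffic (that requires a third-party CONNECT module), but for plain HTTP a minimal, hedged forward-proxy sketch looks like this; the port and resolver are arbitrary choices:

```nginx
server {
    listen 8888;           # the port clients point their proxy settings at
    resolver 8.8.8.8;      # a resolver is required when proxy_pass uses variables

    location / {
        # forward the request to whatever host the client asked for
        proxy_pass http://$http_host$request_uri;
    }
}
```

A client would then set this host and port as its HTTP proxy, e.g. curl -x http://proxy-host:8888 http://example.com/.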

Reverse proxy

Now that we understand the forward proxy, let's look at how the reverse proxy handles things. Take a big Chinese e-commerce site like Taobao: the number of visitors connected to the site at the same moment has exploded, and a single server is nowhere near enough to satisfy people's growing appetite for shopping.

At this point a familiar term appears: distributed deployment, that is, deploying multiple servers to solve the problem of limited capacity.

Much of the site's functionality is implemented directly with Nginx as a reverse proxy; Taobao wrapped Nginx and other components together and gave the result a lofty name: Tengine.

For more information, please visit Tengine’s website:

http://tengine.taobao.org/

So how does the reverse proxy achieve this distributed cluster operation? Let's first look at a schematic diagram (I drew the servers and the reverse proxy in one box because they belong to the same environment, which I'll introduce later):

From the diagram above, you can see clearly that requests sent by multiple clients are received by the Nginx server and then distributed, according to certain rules, to back-end business servers for processing.

At this point the source of each request (the client) is clear, but which server actually handles it is not. Nginx plays the role of a reverse proxy.

The client is unaware of the proxy's existence; a reverse proxy is transparent, and visitors do not know they are accessing a proxy, because the client needs no configuration at all to use it.

A reverse proxy "proxies the server." It is mainly used in distributed deployments of server clusters, and it hides the servers' information.

Reverse proxy functions:

  • Protecting the intranet: the reverse proxy usually exposes the public IP address, while the web servers stay on the intranet.
  • Load balancing, using a reverse proxy server to optimize the load on your website.
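
Here is a minimal, hedged sketch of Nginx as a reverse proxy in front of two hypothetical back-end servers (the pool name and addresses are made up):

```nginx
upstream backend {                       # hypothetical pool name
    server 10.0.0.11:8080;
    server 10.0.0.12:8080;
}

server {
    listen 80;

    location / {
        proxy_pass http://backend;               # hand the request to the pool
        proxy_set_header Host $host;             # preserve the original Host header
        proxy_set_header X-Real-IP $remote_addr; # pass the real client IP along
    }
}
```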

A project scenario

In real projects, the forward proxy and the reverse proxy often coexist in the same application scenario: the forward proxy carries the client's request to the target server, that target server is itself a reverse proxy, and behind the reverse proxy sit multiple real business servers.

The topology diagram is as follows:

A diagram illustrates the difference between forward and reverse proxies, as shown below:

Illustration:

  • In forward Proxy, the Proxy and Client belong to the same LAN (in the box in the figure) and the Client information is hidden.
  • In reverse Proxy, the Proxy and Server belong to the same LAN (in the box in the figure) and the Server information is hidden.

In both cases, a proxy sends and receives requests and responses on someone's behalf, but the direction is flipped: a forward proxy acts for the client, while a reverse proxy acts for the server, which is why the latter is called a reverse proxy.

Load balancing

Now that we've clarified the concept of a proxy server, what rules does Nginx follow to distribute requests when it acts as a reverse proxy? And can the distribution rules be controlled for different application scenarios?

The volume of requests that clients send and the Nginx reverse proxy receives is what we call the load.

Those requests are then distributed according to some rule, and the rule by which different servers end up handling them is a balancing rule.

So the process of distributing the requests a server receives according to such rules is called load balancing.

In real projects, load balancers come in two flavors: hardware and software. Hardware load balancers, such as F5 appliances, are costly.

In exchange, they offer very strong guarantees of stability and data security, which is why companies like China Mobile and China Unicom choose hardware load balancing.

For cost reasons, more companies opt for software load balancing: a message-distribution mechanism built from existing technology combined with ordinary host hardware.

Nginx supports the following load balancing scheduling algorithms:

① Weighted round robin (the default): incoming requests are assigned to the back-end servers one by one, in order. If a back-end server goes down while Nginx is running, it is automatically removed from the rotation, and request handling is unaffected.

In this mode, a weight value can be set for each back-end server to adjust the proportion of requests distributed to it.

The larger the weight, the higher the probability that a request is assigned to that server; in practice, weights are tuned to match each back-end server's hardware configuration.
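
A hedged sketch of a weighted upstream block (the addresses and the 3:1 ratio are illustrative):

```nginx
upstream backend {
    # roughly three requests go to .11 for every one sent to .12
    server 10.0.0.11:8080 weight=3;
    server 10.0.0.12:8080 weight=1;
}
```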

② ip_hash: each request is routed according to a hash of the client's IP address, so a client with a fixed IP always reaches the same back-end server. To some extent this also solves the session-sharing problem in clustered deployments.
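
The same upstream block with ip_hash enabled (a minimal sketch):

```nginx
upstream backend {
    ip_hash;                  # same client IP -> same back-end server
    server 10.0.0.11:8080;
    server 10.0.0.12:8080;
}
```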

③ fair: an intelligent scheduling algorithm that allocates requests dynamically based on how quickly each back-end server processes and responds to them.

Servers with short response times and high efficiency receive requests with higher probability, while slow, inefficient servers receive fewer. It combines the strengths of the previous two algorithms.

Note, however, that Nginx does not support fair by default. To use this scheduling algorithm, you must install the third-party upstream_fair module.
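
Assuming the third-party nginx-upstream-fair module has been compiled in, the upstream block would look like this sketch:

```nginx
upstream backend {
    fair;                     # provided by the third-party upstream_fair module
    server 10.0.0.11:8080;
    server 10.0.0.12:8080;
}
```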

④ url_hash: requests are distributed according to a hash of the requested URL, so each URL always goes to the same back-end server. This can improve cache efficiency when Nginx fronts a cluster of static or cache servers.

Note again that older Nginx versions did not support this algorithm out of the box and required a third-party hash module; recent versions provide a built-in hash directive.
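
On a recent Nginx, a hedged sketch using the built-in hash directive (available since roughly version 1.7.2):

```nginx
upstream backend {
    hash $request_uri;        # same URL -> same back-end server
    server 10.0.0.11:8080;
    server 10.0.0.12:8080;
}
```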

Source: url.cn/5BsRSKU