HTTP proxy

HTTP proxies come in two forms; here is a brief introduction to each.

The first type is the general proxy. This kind of HTTP proxy acts as an intermediary: it is a server to the client that connects to it and a client to the origin server. Its job is to relay HTTP messages between the two ends.

The second type is the tunnel proxy. It carries the communication in the body of the HTTP exchange, which lets it proxy any TCP-based application-layer protocol over HTTP. This kind of proxy uses HTTP's CONNECT method to establish the connection.

General proxy

The first kind of web proxy works as follows: the HTTP client sends a request message to the proxy server. The proxy server handles the request and its connection-related headers correctly (for example, Connection: keep-alive), sends the request on to the origin server, and forwards the response it receives back to the client.
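One part of "handling the connection-related headers correctly" is deciding which request headers may be passed on at all. The following is a minimal Python sketch of that step only. The names headers_to_forward and HOP_BY_HOP are my own illustration; the set follows the hop-by-hop list in RFC 2616 section 13.5.1 plus the non-standard Proxy-Connection, and a real proxy does considerably more.

```python
# Headers that describe a single connection and are not meant to travel
# past the next hop (assumption: list taken from RFC 2616 section 13.5.1,
# plus the non-standard Proxy-Connection).
HOP_BY_HOP = {
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailer", "transfer-encoding", "upgrade", "proxy-connection",
}


def headers_to_forward(headers):
    """Return the (name, value) pairs a proxy would pass to the next hop.

    `headers` is a list of (name, value) tuples as received from the client.
    Note: a proxy that drops Transfer-Encoding has to re-frame the body
    itself; this sketch only filters header names.
    """
    # Any header named inside Connection is hop-by-hop for this request too.
    listed = set()
    for name, value in headers:
        if name.lower() == "connection":
            listed.update(tok.strip().lower() for tok in value.split(",") if tok.strip())

    drop = HOP_BY_HOP | listed
    return [(n, v) for n, v in headers if n.lower() not in drop]
```

With a request carrying Connection: keep-alive, the proxy consumes that header for its own connection to the client and decides separately how to talk to the origin server.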
A figure in the book HTTP: The Definitive Guide illustrates the behavior of this kind of proxy visually.

Suppose a client accesses website A through the proxy. To website A, the proxy looks like the client; the address of the real user is invisible to it, which achieves the goal of hiding the client IP. The proxy can also modify the HTTP request headers and use a custom header such as X-Forwarded-IP to tell the server the real client's IP.

However, the server cannot verify whether the custom header was really added by the proxy or whether the client put it into the request itself, so be careful when reading an IP address from an HTTP header field.
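One common way to be careful is to believe the header only when the TCP connection itself comes from a proxy you run. Here is a minimal Python sketch of that idea. The function client_ip, the TRUSTED_PROXIES set, and the address 10.0.0.5 are all made up for illustration; only the header name X-Forwarded-IP comes from the text above.

```python
import ipaddress

# Hypothetical: addresses of proxies we operate ourselves and therefore trust.
TRUSTED_PROXIES = {ipaddress.ip_address("10.0.0.5")}


def client_ip(peer_addr, headers, header_name="X-Forwarded-IP"):
    """Pick the client IP for one request.

    peer_addr: IP of the TCP peer that connected to us. It comes from the
               socket, so an HTTP header cannot change it.
    headers:   dict of request headers with lower-cased names.
    """
    peer = ipaddress.ip_address(peer_addr)
    if peer not in TRUSTED_PROXIES:
        # Direct client or unknown proxy: ignore whatever the header claims.
        return str(peer)

    claimed = headers.get(header_name.lower())
    if claimed is None:
        return str(peer)
    try:
        return str(ipaddress.ip_address(claimed.strip()))
    except ValueError:
        # Malformed value: fall back to the address we actually saw.
        return str(peer)
```

A request arriving straight from 203.0.113.9 with a forged X-Forwarded-IP is attributed to 203.0.113.9; the header is honored only for connections coming from 10.0.0.5.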

To give the browser a proxy explicitly, you either change the browser or operating system settings by hand, or point it at a Proxy Auto-Configuration (PAC) file that sets the proxy automatically. Some browsers also support the Web Proxy Autodiscovery Protocol (WPAD). Explicitly configuring a proxy in the browser is commonly called a forward proxy. Once a forward proxy is enabled, the browser makes some changes to its HTTP request messages to work around problems in old proxy servers.
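The text does not list the changes the browser makes. One difference I am confident about is the request line: a request sent to a forward proxy carries the full URL as its target, while a request sent directly to the server carries only the path and query. The sketch below shows just that difference; the function name request_line is hypothetical.

```python
from urllib.parse import urlsplit


def request_line(method, url, via_proxy):
    """Build the HTTP/1.1 request line for `url`.

    Sent directly to the origin server, the target is the path and query
    (origin-form). Sent to a forward proxy, the target is the full URL
    (absolute-form), because the proxy has to know which server to contact.
    """
    if via_proxy:
        target = url
    else:
        parts = urlsplit(url)
        target = parts.path or "/"
        if parts.query:
            target += "?" + parts.query
    return f"{method} {target} HTTP/1.1"
```

For http://example.com/a?b=1 this gives "GET /a?b=1 HTTP/1.1" when talking to the server directly and "GET http://example.com/a?b=1 HTTP/1.1" when talking to a proxy.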
There is also the case where a request for website A actually goes to a proxy. After receiving the request message, the proxy sends a request to the server that really provides the service and forwards the response to the browser. This arrangement is commonly called a reverse proxy, and it can be used to hide the server's IP address and port. With a reverse proxy you generally need to change DNS so that the domain name resolves to the proxy server's IP address. The browser then cannot detect that the real server exists, and it needs no configuration change. A reverse proxy is one of the most common ways to deploy a web system. For example, this blog uses Nginx's proxy_pass directive to forward browser requests to the Node.js service behind it.

Tunnel proxy

The principle behind the second kind of web proxy is also simple: through the CONNECT method, the HTTP client asks the tunnel proxy to create a TCP connection to any destination server and port, and the proxy then blindly forwards all subsequent data between the client and the server.
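The proxy side of this exchange can be sketched in a few dozen lines of Python. This is a minimal illustration, not a production tunnel: the function names parse_connect, pipe, and handle_tunnel_client are hypothetical, there is no authentication or access control, and one thread is used per direction.

```python
import socket
import threading


def parse_connect(request_head):
    """Extract (host, port) from the head of a CONNECT request.

    `request_head` is the raw bytes up to the blank line, for example
    b"CONNECT example.com:443 HTTP/1.1\r\nHost: example.com:443\r\n\r\n".
    """
    request_line = request_head.split(b"\r\n", 1)[0].decode("ascii")
    method, target, _version = request_line.split(" ")
    if method != "CONNECT":
        raise ValueError("not a CONNECT request")
    host, _, port = target.rpartition(":")
    return host.strip("[]"), int(port)  # strip brackets of IPv6 literals


def pipe(src, dst):
    """Copy bytes from src to dst until src closes; content is never inspected."""
    try:
        while True:
            chunk = src.recv(65536)
            if not chunk:
                break
            dst.sendall(chunk)
    except OSError:
        pass
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)  # pass end-of-stream along
        except OSError:
            pass


def handle_tunnel_client(client):
    """Serve one client connection: read CONNECT, dial the target, relay."""
    head = b""
    while b"\r\n\r\n" not in head:
        data = client.recv(4096)
        if not data:
            client.close()
            return
        head += data
    try:
        host, port = parse_connect(head)
    except ValueError:
        client.sendall(b"HTTP/1.1 400 Bad Request\r\n\r\n")
        client.close()
        return

    upstream = socket.create_connection((host, port), timeout=10)
    upstream.settimeout(None)  # no timeout while relaying
    client.sendall(b"HTTP/1.1 200 Connection Established\r\n\r\n")

    # Bytes the client sent after the blank line already belong to the tunnel.
    leftover = head.split(b"\r\n\r\n", 1)[1]
    if leftover:
        upstream.sendall(leftover)

    back = threading.Thread(target=pipe, args=(upstream, client), daemon=True)
    back.start()
    pipe(client, upstream)
    back.join()
    client.close()
    upstream.close()
```

After the 200 reply the proxy never looks at the bytes again, which is the "blind forwarding" described above.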
Another figure in the book HTTP: The Definitive Guide illustrates the behavior of a tunnel proxy visually.

Suppose I access website A through a proxy. The browser first sends a CONNECT request asking the proxy to create a TCP connection to website A; once the TCP connection is established, the proxy forwards all subsequent traffic without inspecting it. In theory, therefore, this kind of proxy works with any TCP-based application-layer protocol, including the TLS protocol used by HTTPS websites. This is why such proxies are called tunnels.
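Seen from the client, the same exchange looks roughly like this. The helper open_tunnel is a hypothetical name and the sketch makes simplifying assumptions: it treats any 2xx status as success and discards any bytes that arrive in the same read as the proxy's reply, which is fine for protocols such as TLS where the client speaks first.

```python
import socket


def open_tunnel(proxy_addr, host, port, timeout=10):
    """Ask the proxy at `proxy_addr` for a tunnel to host:port.

    Returns the connected socket; from then on, everything written to it
    is delivered to host:port untouched.
    """
    sock = socket.create_connection(proxy_addr, timeout=timeout)
    request = f"CONNECT {host}:{port} HTTP/1.1\r\nHost: {host}:{port}\r\n\r\n"
    sock.sendall(request.encode("ascii"))

    # Read the proxy's reply up to the blank line that ends its headers.
    reply = b""
    while b"\r\n\r\n" not in reply:
        chunk = sock.recv(4096)
        if not chunk:
            sock.close()
            raise ConnectionError("proxy closed the connection")
        reply += chunk

    status_line = reply.split(b"\r\n", 1)[0].decode("ascii", "replace")
    parts = status_line.split(" ", 2)
    if len(parts) < 2 or not parts[1].startswith("2"):
        sock.close()
        raise ConnectionError(f"proxy refused the tunnel: {status_line}")
    return sock
```

For an HTTPS site, the returned socket could then be handed to ssl.create_default_context().wrap_socket(sock, server_hostname=host), so the TLS handshake runs end to end through the tunnel and the proxy sees only encrypted bytes.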

This article is reprinted and adapted from the author xujinyang2018.

Adapted by: rhino agent