The HTTP protocol is used for communication between clients and servers

HTTP, like many other protocols in the TCP/IP protocol family, is used for communication between clients and servers.

The end that requests access to resources such as text or images is called the client, and the end that provides the resource in response is called the server.

Figure: When using the HTTP protocol, one end must act as the client and the other as the server

When two computers communicate using the HTTP protocol, one end of the communication line must act as the client and the other as the server.

Communication is achieved through the exchange of requests and responses

Figure: The request must be made by the client, and the server replies with the response

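According to the HTTP protocol, a request is always initiated by the client, and the server returns a response to it. In other words, communication always starts from the client; the server does not send a response unless it has received a request.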

Let’s look at a concrete example:

The following is the content of a request packet sent from a client to an HTTP server.

GET /index.html HTTP/1.1
Host: hackr.jp

  • The GET at the beginning of the request line indicates the type of access being requested from the server, called the method.
  • The string “/index.html” that follows identifies the resource being requested, also known as the request URL (Request-URI in the HTTP/1.1 specification).
  • The final HTTP/1.1 is the HTTP version number, which indicates the HTTP features the client is using.

This is a request to access the /index.html page resource on an HTTP server.

A request message consists of the request method, the request URL, the protocol version, optional request header fields, and the entity body.

Figure: Composition of the request message.
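As a rough illustration of this structure, the following Python sketch splits a raw request message into the parts listed above. The function name parse_request is hypothetical, and the parsing is deliberately simplified (it ignores repeated header fields, encodings, and malformed input).

```python
def parse_request(raw: str):
    """Split a raw HTTP request into (method, url, version, headers, body)."""
    # The header section and the entity body are separated by a blank line.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # Request line: method, request URL, protocol version.
    method, url, version = lines[0].split(" ", 2)
    # Remaining lines are "Name: value" header fields.
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, url, version, headers, body


raw_request = "GET /index.html HTTP/1.1\r\nHost: hackr.jp\r\n\r\n"
parsed_request = parse_request(raw_request)
print(parsed_request)
```

Header names are lowercased here only to make lookups simple; that is a choice of this sketch, not something the text above prescribes.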

After receiving the request, the server processes the request content and returns the processing result in the form of a response.

HTTP/1.1 200 OK
Date: Tue, 10 Jul 2012 06:50:15 GMT
Content-Length: 362
Content-Type: text/html

<html>

  • HTTP/1.1: indicates the HTTP version used by the server.
  • 200 OK: the status code and reason phrase (reason-phrase) indicating the result of processing the request.
  • Date: shows the date and time at which the response was created; it is one of the header fields.
  • A blank line separates the header fields from the content that follows, which is called the entity body.

Figure: Composition of response message
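The composition of a response message can be sketched the same way. The Python snippet below is a simplified illustration, with parse_response being a hypothetical helper name; it splits the status line, the header fields, and the entity body of the example response.

```python
def parse_response(raw: str):
    """Split a raw HTTP response into (version, code, reason, headers, body)."""
    # A blank line separates the header section from the entity body.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # Status line: protocol version, status code, reason phrase.
    version, code, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        # Split on the first colon only, so values such as dates stay intact.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers, body


raw_response = (
    "HTTP/1.1 200 OK\r\n"
    "Date: Tue, 10 Jul 2012 06:50:15 GMT\r\n"
    "Content-Length: 362\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "<html>"
)
version, code, reason, headers, body = parse_response(raw_response)
print(version, code, reason, headers["date"])
```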

HTTP is a protocol that does not save state

HTTP is a stateless protocol that does not save state. The HTTP protocol itself does not store the state of communication between requests and responses. That is, at the HTTP level, the protocol does not persist requests that have been sent or the corresponding response.

Figure: The HTTP protocol itself does not have the ability to save previously sent requests or responses

However, as the Web developed, this stateless behavior could no longer meet every need. To provide the desired ability to maintain state, Cookie technology was introduced. With cookies layered on top of HTTP communication, state can be managed.

Request URLs locate resources

HTTP uses URLs to locate resources on the Internet. Because a URL identifies a resource uniquely, a resource anywhere on the Internet can be accessed.

Figure: The HTTP protocol uses URLs to let clients locate server-side resources

There are many ways to specify a URL:

Figure: Take hackr.jp/index.html as an example of a request
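Two commonly seen forms are putting the complete URL in the request line, or putting only the path in the request line and naming the server in the Host header field. The following Python sketch builds both forms for the same resource. The helper name request_forms is hypothetical, and the http:// scheme is an assumption, since the caption above gives only hackr.jp/index.html.

```python
from urllib.parse import urlsplit


def request_forms(url: str) -> dict:
    """Return two equivalent ways of naming the same resource in a GET request."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    host_line = f"Host: {parts.netloc}"
    return {
        # Form 1: the full URL appears in the request line.
        "absolute": [f"GET {url} HTTP/1.1", host_line],
        # Form 2: only the path appears; the Host field names the server.
        "origin": [f"GET {path} HTTP/1.1", host_line],
    }


forms = request_forms("http://hackr.jp/index.html")
for name, lines in forms.items():
    print(name, lines)
```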

In addition, if a request is made to the server itself instead of a specific resource, an * can be used instead of the request URL. The following example queries the types of HTTP methods supported by the HTTP server.

OPTIONS * HTTP/1.1

HTTP method to inform the server of the intent

GET: Obtains resources

The GET method is used to request access to the resource identified by the URL. The server parses the specified resource and returns the response content. That is, if the requested resource is text, it is returned as is; if it is a program such as a CGI (Common Gateway Interface) script, its output after execution is returned.

Figure: An example of a request-response using the GET method

Request:

GET /index.html HTTP/1.1
Host: www.hackr.jp

Response:

Returns the page resource for index.html

Request:

GET /index.html HTTP/1.1
Host: www.hackr.jp
If-Modified-Since: Thu, 12 Jul 2012 07:30:00 GMT

Response:

Returns the index.html page resource only if it has been updated since 07:30 on July 12, 2012. If the content has not been updated, the response carries the status code 304 Not Modified.
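The decision the server makes for the second request can be sketched in a few lines of Python. This is only an illustration under simple assumptions: conditional_status is a hypothetical function name, and the resource's last modification time is passed in directly rather than read from a file.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_status(last_modified: datetime, if_modified_since: str) -> str:
    """Pick the status line for a GET carrying an If-Modified-Since field."""
    since = parsedate_to_datetime(if_modified_since)
    if last_modified > since:
        return "200 OK"            # resource changed: send it again
    return "304 Not Modified"      # unchanged: no entity body is needed


header_value = "Thu, 12 Jul 2012 07:30:00 GMT"
updated = datetime(2012, 7, 12, 8, 0, 0, tzinfo=timezone.utc)
stale = datetime(2012, 7, 11, 9, 0, 0, tzinfo=timezone.utc)
print(conditional_status(updated, header_value))
print(conditional_status(stale, header_value))
```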

POST: transmits the entity body

The POST method is used to transmit an entity body. While it is possible to transmit an entity body with the GET method, that is generally not done; the POST method is used instead.

Request – response example using the POST method

Request:

POST /submit.cgi HTTP/1.1
Host: www.hackr.jp
Content-Length: 1560

Response:

Returns the result of submit.cgi processing the received data
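To make the relationship between the entity body and the Content-Length field concrete, here is a minimal Python sketch that assembles such a POST request. The helper build_post is hypothetical, and the 1560-byte body is placeholder data chosen only to match the example above.

```python
def build_post(path: str, host: str, body: bytes) -> bytes:
    """Assemble a POST request whose Content-Length matches the body size."""
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"  # blank line separating header fields from the entity body
    )
    return head.encode("ascii") + body


payload = b"x" * 1560  # placeholder data, 1560 bytes
message = build_post("/submit.cgi", "www.hackr.jp", payload)
print(message[:70])
```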

PUT: transfers files

The PUT method is used to transfer files, much like uploading a file over FTP. It requires the body of the request message to contain the file content, which is then saved at the location specified by the request URL. However, because the HTTP/1.1 PUT method has no authentication mechanism of its own, anyone could upload files; this is a security problem, so the average website does not use it. The PUT method may be opened up when it is combined with the authentication mechanism of a Web application, or on websites designed in the REST (Representational State Transfer) architectural style.

Example of a request-response using the PUT method

Request:

PUT /example.html HTTP/1.1
Host: www.hackr.jp
Content-Type: text/html
Content-Length: 1560
(1560 bytes of data)

Response:

The response returns the status code 204 No Content.

DELETE: deletes a file

The DELETE method is used to delete a file, which makes it the opposite of PUT. The DELETE method deletes the resource specified by the request URL.

However, the HTTP/1.1 DELETE method, like the PUT method, has no authentication mechanism of its own, so the average website does not use it either. It may still be opened up when combined with the authentication mechanism of a Web application, or when the site follows the REST style.

Example of a request-response using the DELETE method

Request:

DELETE /example.html HTTP/1.1
Host: www.hackr.jp

Response:

The response returns the status code 204 No Content.

HEAD: obtains the message header

The HEAD method is the same as the GET method, except that the message body is not returned. It is used to check that a URL is valid and to find out when a resource was last updated.

Figure: Like GET, but does not return the message body

Example of a request-response using the HEAD method

Request:

HEAD /index.html HTTP/1.1
Host: www.hackr.jp

Response:

Returns the response header associated with index.html
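The difference between GET and HEAD can be illustrated with a toy request handler in Python. Everything here is hypothetical (the respond function, the in-memory resources dictionary); the point is only that HEAD yields the same status and header fields as GET while leaving the body empty.

```python
def respond(method: str, path: str, resources: dict):
    """Return (status, headers, body) for a GET or HEAD request."""
    if method not in ("GET", "HEAD"):
        return "405 Method Not Allowed", {"Content-Length": "0"}, b""
    content = resources.get(path)
    if content is None:
        return "404 Not Found", {"Content-Length": "0"}, b""
    headers = {
        "Content-Type": "text/html",
        "Content-Length": str(len(content)),
    }
    # HEAD reports the same header fields as GET but omits the message body.
    body = content if method == "GET" else b""
    return "200 OK", headers, body


resources = {"/index.html": b"<html></html>"}
get_result = respond("GET", "/index.html", resources)
head_result = respond("HEAD", "/index.html", resources)
print(get_result)
print(head_result)
```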

OPTIONS: Asks for supported methods

The OPTIONS method is used to query the methods supported for the resource specified by the request URL.

Example of a request-response using the OPTIONS method

Request:

OPTIONS * HTTP/1.1
Host: www.hackr.jp

Response:

HTTP/1.1 200 OK
Allow: GET, POST, HEAD, OPTIONS
(returns the methods supported by the server)
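On the client side, the supported methods can be read back out of the Allow header field. The following Python sketch shows one way to do that; allowed_methods is a hypothetical helper, and the response text is the example from above written with explicit line endings.

```python
def allowed_methods(raw_response: str) -> list:
    """Extract the method names listed in the Allow header field."""
    head, _, _ = raw_response.partition("\r\n\r\n")
    for line in head.split("\r\n")[1:]:
        name, _, value = line.partition(":")
        if name.strip().lower() == "allow":
            # Methods are comma separated; surrounding spaces are not significant.
            return [m.strip() for m in value.split(",") if m.strip()]
    return []


options_response = "HTTP/1.1 200 OK\r\nAllow: GET, POST, HEAD, OPTIONS\r\n\r\n"
methods = allowed_methods(options_response)
print(methods)
```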

TRACE: traces the path

The TRACE method makes the Web server loop the request it received back to the client.

When the request is sent, the Max-Forwards header field is filled with a number. Each server the request passes through decrements that number by 1; when it reaches 0, forwarding stops, and the server that received the request last returns a response with status code 200 OK.

With TRACE, the client can see how the request it sent was processed or modified along the way. This is because a request on its way to the origin server may be relayed through proxies, and the TRACE method is used to confirm the sequence of operations that took place along the connection.

However, the TRACE method is not commonly used, and it is prone to XST (Cross-Site Tracing) attacks, which makes it even less likely to be used.

A request-response example using the TRACE method

Request:

TRACE / HTTP/1.1
Host: hackr.jp
Max-Forwards: 2

Response:

HTTP/1.1 200 OK
Content-Type: message/http
Content-Length: 1023

TRACE / HTTP/1.1
Host: hackr.jp
Max-Forwards: 2
(the response body contains the request content)
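The Max-Forwards countdown can be modeled with a short Python sketch. The function trace_route and the hop names are hypothetical; the model assumes each intermediary either forwards the request after decrementing the value, or answers it itself when the value it receives is already 0.

```python
def trace_route(max_forwards: int, hops: list):
    """Return (responder, remaining) for a TRACE sent along the given hops.

    hops lists the intermediaries in order, ending with the origin server.
    """
    remaining = max_forwards
    for hop in hops[:-1]:
        if remaining == 0:
            # This intermediary must not forward; it responds itself.
            return hop, remaining
        remaining -= 1  # decrement before passing the request on
    # The request reached the last hop, which always responds.
    return hops[-1], remaining


path = ["proxy1", "proxy2", "origin"]
for value in (0, 1, 2):
    print(value, trace_route(value, path))
```

With two proxies in front of the origin server, a value of 2 is just enough for the request to reach the origin, while smaller values make one of the proxies answer instead.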

CONNECT: requests a tunnel through a proxy

The CONNECT method asks a proxy server to establish a tunnel so that TCP communication can be carried through it. It is mainly used with the SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols, which encrypt the communication before it is sent through the network tunnel.

The format of the CONNECT method is as follows:

CONNECT proxy-server-name:port-number HTTP-version

An example of a request response using the CONNECT method

Request:

CONNECT proxy.hackr.jp:8080 HTTP/1.1
Host: proxy.hackr.jp

Response:

HTTP/1.1 200 OK
(the network tunnel is then entered)

Using methods to issue commands

When a request message is sent to the resource specified by the request URL, a command called a method is used. The purpose of a method is to indicate the action the requested resource should perform. Methods include GET, POST, HEAD, and so on.

The following table lists the methods supported by HTTP/1.0 and HTTP/1.1. In addition, method names are case sensitive, so use uppercase letters.

Table 2-1: Methods supported by HTTP/1.0 and HTTP/1.1

Method    Description                               Supported HTTP versions
GET       Obtains a resource                        1.0, 1.1
POST      Transmits an entity body                  1.0, 1.1
PUT       Transfers a file                          1.0, 1.1
HEAD      Obtains the message header                1.0, 1.1
DELETE    Deletes a file                            1.0, 1.1
OPTIONS   Asks for supported methods                1.1
TRACE     Traces the path                           1.1
CONNECT   Requests a tunnel through a proxy         1.1
LINK      Establishes a relationship to a resource  1.0
UNLINK    Removes a relationship                    1.0

Of the many methods cited here, LINK and UNLINK have been deprecated by HTTP/1.1 and are no longer supported.
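The table and the case-sensitivity rule can be captured in a small lookup, sketched below in Python. The names METHOD_VERSIONS and is_supported are hypothetical; the data is simply the content of Table 2-1.

```python
# Method name -> HTTP versions that support it, as listed in Table 2-1.
METHOD_VERSIONS = {
    "GET": {"1.0", "1.1"},
    "POST": {"1.0", "1.1"},
    "PUT": {"1.0", "1.1"},
    "HEAD": {"1.0", "1.1"},
    "DELETE": {"1.0", "1.1"},
    "OPTIONS": {"1.1"},
    "TRACE": {"1.1"},
    "CONNECT": {"1.1"},
    "LINK": {"1.0"},
    "UNLINK": {"1.0"},
}


def is_supported(method: str, version: str) -> bool:
    """Check a method against the table; names are compared case-sensitively."""
    return version in METHOD_VERSIONS.get(method, set())


print(is_supported("GET", "1.1"))      # uppercase name found in the table
print(is_supported("get", "1.1"))      # lowercase name is a different token
print(is_supported("OPTIONS", "1.0"))  # method not listed for HTTP/1.0
```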

Persistent connections save traffic

In the original version of the HTTP protocol, a TCP connection was established and torn down for every single HTTP exchange.

In those days, communication consisted of small text transfers, so this was not a problem. However, as HTTP became popular, documents containing large numbers of images became more common.

For example, when a browser displays an HTML page containing multiple images, it sends a request for the HTML page itself and then further requests for the other resources the page contains. Each request therefore causes an unnecessary TCP connection to be established and torn down, which increases communication overhead.

Persistent connections

To solve the TCP connection problem described above, HTTP/1.1 and some HTTP/1.0 implementations introduced persistent connections (HTTP Persistent Connections, also known as HTTP keep-alive or HTTP connection reuse). The defining characteristic of a persistent connection is that the TCP connection stays open as long as neither end explicitly closes it.

Figure: With persistent connections, multiple requests and responses are exchanged after a single TCP connection is established

The benefits of persistent connections are that they reduce the overhead caused by the repeated establishment and disconnection of TCP connections and reduce the load on the server side. In addition, the reduced overhead allows HTTP requests and responses to end earlier, so that Web pages display faster and respond faster.

In HTTP/1.1, all connections are persistent by default, but persistent connections were never standardized in HTTP/1.0. Although some HTTP/1.0 servers implement persistent connections through non-standard means, a server does not necessarily support them. Needless to say, the client has to support persistent connections as well as the server.
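The default behavior of the two versions can be summarized in a small Python sketch. It assumes the widely used Connection header field with the values close and keep-alive, which the text above does not name explicitly; is_persistent is a hypothetical helper, and header names are assumed to be lowercased already.

```python
def is_persistent(version: str, headers: dict) -> bool:
    """Decide whether the connection should stay open after this message."""
    # The Connection field may hold several comma-separated tokens.
    tokens = {t.strip().lower() for t in headers.get("connection", "").split(",")}
    if version == "HTTP/1.1":
        # Persistent by default unless one side asks to close.
        return "close" not in tokens
    # HTTP/1.0: persistent only if keep-alive was requested (non-standard).
    return "keep-alive" in tokens


print(is_persistent("HTTP/1.1", {}))
print(is_persistent("HTTP/1.1", {"connection": "close"}))
print(is_persistent("HTTP/1.0", {}))
print(is_persistent("HTTP/1.0", {"connection": "Keep-Alive"}))
```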

Pipelining

Persistent connections make it possible to send requests in a pipelined fashion (pipelining). Previously, after sending a request, the client had to wait for and receive the response before sending the next one. With pipelining, the next request can be sent right away without waiting for the response.

This makes it possible to send multiple requests simultaneously in parallel without having to wait for one response after another.

Figure: Send the next request without waiting for a response

For example, when requesting an HTML Web page that contains 10 images, using a persistent connection finishes the requests sooner than opening a separate connection for each one. Pipelining is faster still. The more requests there are, the more significant the time difference becomes.
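A toy timing model makes the comparison easier to see. The Python sketch below is an idealized illustration, not a measurement: total_time is a hypothetical function, costs are in arbitrary units, all requests are assumed to be known up front, and transfer time for the response bodies is ignored.

```python
def total_time(n_requests: int, handshake: int, round_trip: int, mode: str) -> int:
    """Estimate the time to complete n_requests under a simplified model."""
    if mode == "separate":
        # A new TCP connection for every request.
        return n_requests * (handshake + round_trip)
    if mode == "persistent":
        # One connection; each request waits for the previous response.
        return handshake + n_requests * round_trip
    if mode == "pipelined":
        # One connection; all requests are sent without waiting.
        return handshake + round_trip
    raise ValueError(f"unknown mode: {mode}")


for mode in ("separate", "persistent", "pipelined"):
    print(mode, total_time(10, handshake=1, round_trip=1, mode=mode))
```

In this model the gap between the three modes grows linearly with the number of requests, which matches the observation that the difference becomes more significant as requests increase.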

State management using cookies

HTTP is a stateless protocol that does not manage the status of previous requests and responses. That is, the request cannot be processed based on the previous state.

If a website that requires login authentication cannot track login state (that is, it does not record that the user has logged in), the user has to log in again on every page, or extra parameters must be added to every request message to carry the login state.

Admittedly, a stateless protocol has its advantages. Because no state needs to be saved, CPU and memory consumption on the server is reduced. Moreover, it is precisely because HTTP itself is so simple that it is used in such a wide variety of scenarios.

Figure: If you let the server manage all the client state, it becomes a burden

Cookie technology was introduced to resolve this tension while preserving the stateless nature of the protocol. Cookies control client state by writing Cookie information into request and response messages.

The client saves a Cookie according to the Set-Cookie header field in the response message sent by the server. The next time the client sends a request to that server, it automatically adds the Cookie value to the request message.

When the server receives the Cookie sent by the client, it checks which client the request came from and compares it against its own records to recover the earlier state information.

  • First request (no Cookie information)

  • Second and subsequent requests (with Cookie information)

The figure above shows the Cookie sending interaction. The contents of HTTP request packets and response packets are as follows.

  1. Request message (no Cookie information)

    GET /reader/ HTTP/1.1
    Host: hackr.jp
    *The header fields contain no Cookie information

  2. Response message (the server generates Cookie information)

    HTTP/1.1 200 OK
    Date: Thu, 12 Jul 2012 07:12:20 GMT
    Server: Apache
    Set-Cookie: sid=1342077140226724; path=/; expires=Wed, 10-Oct-12 07:12:20 GMT
    Content-Type: text/plain; charset=UTF-8

  3. Request message (the saved Cookie information is sent automatically)

    GET /image/ HTTP/1.1
    Host: hackr.jp
    Cookie: sid=1342077140226724

    For the header fields corresponding to cookies in request and response packets, refer to the following sections.
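The three messages above can be imitated with a small in-memory simulation in Python. This is a sketch under stated assumptions: the sessions dictionary, server_handle, and the Client class are all hypothetical, the session identifier is fixed to the value from the example instead of being generated, and no real network traffic is involved.

```python
from http.cookies import SimpleCookie

sessions = {}  # hypothetical server-side records: sid -> state


def server_handle(request_headers: dict) -> dict:
    """Return response headers; issue a Cookie if the client has none."""
    cookie = SimpleCookie(request_headers.get("Cookie", ""))
    if "sid" in cookie and cookie["sid"].value in sessions:
        # Known client: look up and update its earlier state.
        sessions[cookie["sid"].value]["visits"] += 1
        return {}
    sid = "1342077140226724"  # fixed for illustration; a real server would generate one
    sessions[sid] = {"visits": 1}
    return {"Set-Cookie": f"sid={sid}; path=/"}


class Client:
    def __init__(self):
        self.jar = SimpleCookie()  # Cookies saved from earlier responses

    def request(self):
        headers = {"Host": "hackr.jp"}
        if self.jar:
            # Automatically attach saved Cookie values to the request.
            headers["Cookie"] = "; ".join(f"{k}={m.value}" for k, m in self.jar.items())
        response = server_handle(headers)
        if "Set-Cookie" in response:
            self.jar.load(response["Set-Cookie"])
        return headers, response


client = Client()
first = client.request()   # no Cookie sent; server answers with Set-Cookie
second = client.request()  # saved Cookie is sent back automatically
print(first)
print(second)
```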

Note: Excerpt from Illustrated HTTP