HTTP: Information contained in HTTP messages

HTTP communication consists of a request from the client to the server and a response from the server back to the client. Today we’re going to look at how requests and responses work.

The HTTP message

HTTP message: the information exchanged in HTTP communication is called an HTTP message (this article also calls it an HTTP packet)
Request message: an HTTP message sent by the requesting end (the client)
Response message: an HTTP message sent by the responding end (the server)

An HTTP message is text made up of multiple lines of data, with CR+LF as the line break. It is broadly divided into a message header and a message body, separated by the first empty line (CR+LF). A message body is not always present.
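As a minimal sketch of the header/body split described above, the following Python function cuts a raw message at the first empty line. The function name `split_http_message` and the sample message are illustrative assumptions, not part of any library.

```python
def split_http_message(raw):
    """Split a raw HTTP message (bytes) into its header lines and its body."""
    # The header ends at the first empty line, i.e. the first CR+LF CR+LF.
    head, _, body = raw.partition(b"\r\n\r\n")
    # Header lines are themselves separated by CR+LF.
    header_lines = head.decode("iso-8859-1").split("\r\n")
    return header_lines, body


# Hypothetical response used only for illustration.
raw_message = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/plain\r\n"
    b"Content-Length: 5\r\n"
    b"\r\n"
    b"hello"
)
header_lines, body = split_http_message(raw_message)
print(header_lines)
print(body)
```

A message without a body, such as a typical GET request, simply yields an empty byte string as its body.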

Figure: Structure of HTTP packets

Structure of request and response messages


Figure: Structure of request message (top) and response message (bottom)

Figure: Examples of request messages (top) and response messages (bottom)

The header of a request message or response message consists of the following data:

Request line

Contains the method used for the request, the request URI, and the HTTP version.

Status line

Contains the HTTP version, the status code indicating the result of the response, and the reason phrase.

Header fields

Contain the headers that express the conditions and attributes of the request or response. There are generally four types: general headers, request headers, response headers, and entity headers.

Other

May contain header fields that are not defined in the HTTP RFCs (cookies, etc.).
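The parts listed above can be pulled apart with a few string operations. This is a rough sketch only: the helper names are hypothetical, the sample lines are made up, and it does not handle malformed input or repeated header names.

```python
def parse_request_line(line):
    # Method, request target, and HTTP version are separated by single spaces.
    method, target, version = line.split(" ", 2)
    return {"method": method, "target": target, "version": version}


def parse_status_line(line):
    # The reason phrase may contain spaces, so split at most twice.
    version, code, reason = line.split(" ", 2)
    return {"version": version, "status": int(code), "reason": reason}


def parse_header_fields(lines):
    # Each header field is "Name: value"; names are case-insensitive.
    # A repeated name overwrites the earlier value in this simplified sketch.
    fields = {}
    for line in lines:
        name, _, value = line.partition(":")
        fields[name.strip().lower()] = value.strip()
    return fields


request = parse_request_line("GET /index.html HTTP/1.1")
status = parse_status_line("HTTP/1.1 206 Partial Content")
fields = parse_header_fields(["Host: example.com", "Accept-Language: en"])
print(request, status, fields)
```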

Encoding increases the transfer rate

HTTP can transmit data as is, but it can also encode the data in transit to improve the transfer rate. Encoding at transfer time lets a server handle a large number of requests efficiently. However, the encoding is work the computer has to perform, so it consumes more resources such as CPU.

Difference between the message body and the entity body

Message

The basic unit of HTTP communication. It consists of a sequence of octets (an octet is 8 bits) and is transmitted over HTTP.

Entity

The payload transferred as a request or response. It consists of entity headers and an entity body.

The message body of an HTTP message carries the entity body of the request or response.

Generally, the message body is equal to the entity body. The two differ only when an encoding applied during transfer changes the content of the entity body.

Content encoding for compressed transfer

When sending a file, it is common to compress it first to reduce its size. HTTP has a feature called content encoding that does something similar.

Content encoding specifies the encoding applied to the entity content and keeps the entity information compressed as is. The client receives the encoded entity and decodes it.

Figure: Content encoding

Common content encodings are the following:

gzip (GNU zip)
compress (standard compression on UNIX systems)
deflate (zlib)
identity (no encoding)
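To make the idea concrete, here is a small sketch of gzip content encoding using Python's standard `gzip` module. The sample body and the `response_headers` dictionary are illustrative assumptions; a real server would also check the client's Accept-Encoding header before compressing.

```python
import gzip

# A repetitive sample entity body, invented for illustration.
entity_body = b"<html>" + b"hello world " * 200 + b"</html>"

# Sender side: compress the entity body and label it with Content-Encoding.
encoded_body = gzip.compress(entity_body)
response_headers = {
    "Content-Encoding": "gzip",
    "Content-Length": str(len(encoded_body)),
}

# Receiver side: the client sees Content-Encoding: gzip and decodes the body.
decoded_body = gzip.decompress(encoded_body)

print(len(entity_body), len(encoded_body))
```

The compressed body is what travels on the wire, while the client ends up with the original entity after decoding, which is the trade of CPU time for transfer size described earlier.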

Chunked transfer coding sends data in pieces

During HTTP communication, the browser cannot display the requested page until the whole encoded entity has been transferred. When large amounts of data are transferred, dividing the data into multiple pieces lets the browser display the page progressively.

This ability to split the entity body into chunks is called chunked transfer coding.

Figure: Chunked transfer coding

Chunked transfer coding divides the entity body into parts (chunks). Each chunk is preceded by its size in hexadecimal, and the end of the entity body is marked by a final chunk of size 0 (CR+LF).

The receiving client is responsible for decoding a chunked entity body and restoring it to the entity body as it was before encoding.

HTTP/1.1 has a mechanism called transfer coding, which transmits data in a given encoding during communication. According to the source, it is defined only for chunked transfer coding.
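The chunk format described above can be sketched in a few lines of Python. The function names `encode_chunked` and `decode_chunked` are hypothetical, the chunk size is arbitrary, and trailer fields after the last chunk are not handled.

```python
def encode_chunked(body, chunk_size=8):
    """Encode bytes as a chunked body: hex size, CR+LF, data, CR+LF, ..."""
    out = bytearray()
    for start in range(0, len(body), chunk_size):
        chunk = body[start:start + chunk_size]
        # Each chunk is preceded by its size written in hexadecimal.
        out += f"{len(chunk):x}".encode("ascii") + b"\r\n" + chunk + b"\r\n"
    # A chunk of size 0 marks the end of the entity body.
    out += b"0\r\n\r\n"
    return bytes(out)


def decode_chunked(data):
    """Restore the original entity body from a chunked body."""
    body = bytearray()
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        # Anything after ';' on the size line is a chunk extension; ignore it.
        size_text = data[pos:line_end].split(b";")[0].decode("ascii")
        size = int(size_text, 16)
        if size == 0:
            break
        start = line_end + 2
        body += data[start:start + size]
        pos = start + size + 2  # skip the chunk data and its trailing CR+LF
    return bytes(body)


encoded = encode_chunked(b"hello world, chunked")
print(encoded)
print(decode_chunked(encoded))
```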

Multipart object collections for sending multiple kinds of data

When you send an email, you can write text in the message and add multiple attachments. This works because of the Multipurpose Internet Mail Extensions (MIME) mechanism, which lets email handle different types of data such as text, pictures, and video.

HTTP adopts the same multipart object collection. The body of a single message can contain multiple types of entities. It is typically used when uploading images or text files.

The multi-part object collection contains the following objects:

multipart/form-data

Used when a Web form file is uploaded

    Content-Type: multipart/form-data; boundary=AaB03x

    --AaB03x
    Content-Disposition: form-data; name="field1"

    Joe Blow
    --AaB03x
    Content-Disposition: form-data; name="pics"; filename="file1.txt"
    Content-Type: text/plain

    ...(file1.txt data)...
    --AaB03x--

multipart/byteranges

Used when a response message with status code 206 (Partial Content) contains content for multiple ranges.

    HTTP/1.1 206 Partial Content
    Date: Fri, 13 Jul 2012 02:45:26 GMT
    Last-Modified: Fri, 31 Aug 2007 02:02:20 GMT
    Content-Type: multipart/byteranges; boundary=THIS_STRING_SEPARATES

    --THIS_STRING_SEPARATES
    Content-Type: application/pdf
    Content-Range: bytes 500-999/8000

    ...(range-specified data)...
    --THIS_STRING_SEPARATES
    Content-Type: application/pdf
    Content-Range: bytes 7000-7999/8000

    ...(range-specified data)...
    --THIS_STRING_SEPARATES--

When a multipart object collection is used in an HTTP message, Content-Type must be set in the header fields.

Use the boundary string to separate the entities in a multipart object collection. Insert the "--" marker before the boundary string at the start of each entity (e.g. --AaB03x, --THIS_STRING_SEPARATES), and after the final boundary string add "--" as well to mark the end of the collection (e.g. --AaB03x--, --THIS_STRING_SEPARATES--).
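The boundary rules above can be illustrated with a small builder for a multipart/form-data body. This is a sketch under stated assumptions: `build_multipart_form_data` is a hypothetical helper, the field names and file contents are invented, and no escaping of quotes in names is attempted.

```python
def build_multipart_form_data(boundary, parts):
    """Build a multipart/form-data body.

    parts is a list of (name, filename or None, content_type or None, data bytes).
    """
    delimiter = b"--" + boundary.encode("ascii")
    lines = []
    for name, filename, content_type, data in parts:
        # "--" plus the boundary string starts each entity.
        lines.append(delimiter)
        disposition = f'Content-Disposition: form-data; name="{name}"'
        if filename is not None:
            disposition += f'; filename="{filename}"'
        lines.append(disposition.encode("utf-8"))
        if content_type is not None:
            lines.append(f"Content-Type: {content_type}".encode("ascii"))
        lines.append(b"")  # empty line between the part headers and its data
        lines.append(data)
    # The final boundary has "--" on both sides to close the collection.
    lines.append(delimiter + b"--")
    lines.append(b"")
    return b"\r\n".join(lines)


form_body = build_multipart_form_data(
    "AaB03x",
    [
        ("field1", None, None, b"Joe Blow"),
        ("pics", "file1.txt", "text/plain", b"file contents"),
    ],
)
content_type_header = "multipart/form-data; boundary=AaB03x"
print(form_body.decode("utf-8"))
```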

Range requests for retrieving part of the content

In the past, users couldn’t access the Internet at today’s speeds. Downloading a large image or file was already a struggle, and if the connection dropped during the download, you had to start all over again. To solve this problem, a resume mechanism is needed. Resuming means continuing the download from the point where it was interrupted.

To implement this, the client needs to specify which part of the entity to download. A request sent with a specified range like this is called a range request.

For example, for a resource of 10000 bytes, a range request can ask for only bytes 5001 through 10000.

When performing a range request, the Range header field specifies the byte range of the resource. The byte range is specified as follows:

From 5001 to 10000 bytes
```
Range: bytes=5001-10000
```
From byte 5001 to the end
```
Range: bytes=5001-
```
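Multiple ranges: the last 3000 bytes and bytes 5000 to 7000 (a range written as -3000 is a suffix range counted back from the end of the resource; to request from the beginning through byte 3000, write 0-3000 instead)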
```
Range: bytes=-3000, 5000-7000
```

For a range request, a response message with status code 206 Partial Content is returned. For a request with multiple ranges, the response also indicates multipart/byteranges in the Content-Type header field.

If the server is unable to respond to a range request, the status code 200 OK and the complete entity content are returned.
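As a rough sketch of how a server might interpret the Range header, the function below resolves each range against the resource length. The names `resolve_byte_ranges` and `content_range` are hypothetical, malformed numbers are not handled, and the behaviour follows the standard rules that byte offsets start at 0 and that a range of the form -N means the last N bytes.

```python
def resolve_byte_ranges(range_header, resource_length):
    """Turn a Range header value into a list of inclusive (start, end) offsets.

    Returns None if the unit is not "bytes", in which case a server would
    typically ignore the header and answer 200 OK with the full entity.
    """
    unit, _, spec = range_header.partition("=")
    if unit.strip() != "bytes":
        return None
    resolved = []
    for part in spec.split(","):
        first, _, last = part.strip().partition("-")
        if first == "":
            # Suffix range "-N": the last N bytes of the resource.
            start = max(resource_length - int(last), 0)
            end = resource_length - 1
        else:
            start = int(first)
            # An open end, or an end past the resource, is clamped to the last byte.
            end = int(last) if last else resource_length - 1
            end = min(end, resource_length - 1)
        if start <= end:  # skip ranges that cannot be satisfied
            resolved.append((start, end))
    return resolved


def content_range(start, end, resource_length):
    # Value of the Content-Range header for one range in a 206 response.
    return f"bytes {start}-{end}/{resource_length}"


print(resolve_byte_ranges("bytes=5001-10000", 10000))
print(resolve_byte_ranges("bytes=-3000, 5000-7000", 10000))
print(content_range(5001, 9999, 10000))
```

Because offsets are zero-based, the last valid offset of a 10000-byte resource is 9999, which is why an end position of 10000 is clamped in the first example.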

Content negotiation returns the most appropriate content

Multiple pages with the same content may exist on the same Web site. English and Chinese versions of a Web page, for example, have the same content but are written in different languages.

If the browser's default language is English, the English version of the page is displayed when you access the URL; if it is Chinese, the Chinese version is displayed for the same URL. This mechanism is called content negotiation.

Figure: Visiting www.google.com/

Content negotiation means that the client and the server negotiate the content of the response resource, and the server then provides the most suitable resource to the client. The decision is based on the language, character set, encoding, and similar properties of the response resource.

The following header fields in the request message serve as the criteria for the decision.

Accept
Accept-Charset
Accept-Encoding
Accept-Language
Content-Language
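To show how such a header field can drive the choice, here is a minimal sketch that picks a page language from an Accept-Language value. The helper names are hypothetical, the quality values (q=...) are a detail of the header syntax that the text above does not cover, and wildcard entries such as * are not handled.

```python
def parse_accept_language(header):
    """Return (language tag, quality) pairs, most preferred first."""
    prefs = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        params = params.strip()
        # A missing q parameter means the default quality of 1.0.
        quality = float(params[2:]) if params.startswith("q=") else 1.0
        prefs.append((tag.strip().lower(), quality))
    # The sort is stable, so entries with equal quality keep header order.
    prefs.sort(key=lambda pair: pair[1], reverse=True)
    return prefs


def choose_language(header, available, default):
    """Pick the first available language the client accepts, else the default."""
    for tag, quality in parse_accept_language(header):
        if quality <= 0:
            continue  # q=0 means "not acceptable"
        primary = tag.split("-")[0]
        for candidate in available:
            # Match either the full tag or just its primary subtag.
            if candidate.lower() in (tag, primary):
                return candidate
    return default


print(choose_language("zh-CN,zh;q=0.9,en;q=0.8", ["en", "zh"], "en"))
print(choose_language("en-US,en;q=0.9", ["en", "zh"], "zh"))
```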

There are three types of content negotiation:

Server-driven negotiation

Content negotiation is performed by the server. The server uses the header fields of the request as a reference and makes the decision automatically. However, because the decision is based only on the information the browser sends, the result is not necessarily the best content for the user.

Client-driven negotiation

Content negotiation is performed by the client. The user manually selects from a list of options displayed in the browser. The selection can also be made automatically on the Web page with JavaScript, for example by switching to the PC version or the mobile version of the page according to the OS or browser type.

Transparent negotiation

A combination of server-driven and client-driven negotiation, in which the server and the client each carry out part of the content negotiation.

Note: Excerpt from Illustrated HTTP