HTTP Packet Format

A TCP segment is divided into two parts: the TCP header and the data.

HTTP packets, by contrast, are text-oriented. Each field in an HTTP packet is an ASCII string, and the length of each field is not fixed. HTTP has two types of packets: request packets and response packets.

An HTTP request or response packet consists of the following contents:

  • A start line (the request line or the status line)

  • HTTP header fields

  • A blank line

  • Optional HTTP packet body data
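The layout above can be sketched in Python. This is a minimal illustration rather than a full parser; the function name `split_message` and the sample request (path, host, body) are assumptions made for the example.

```python
def split_message(raw: bytes):
    """Split a raw HTTP message into (start_line, header_lines, body)."""
    # The blank line (CRLF CRLF) separates the head from the optional body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("ascii").split("\r\n")
    return lines[0], lines[1:], body


# Hypothetical request used only for illustration.
raw = (
    b"POST /submit HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Content-Length: 5\r\n"
    b"\r\n"
    b"hello"
)
start_line, header_lines, body = split_message(raw)
print(start_line)
print(header_lines)
print(body)
```

The same function works for a response, because only the content of the first line differs.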

The request message

An HTTP request packet consists of the following parts:

  • The request line

    • Request method

    • Request URI

    • HTTP Version

  • The request headers

    • Content-Type, and so on

  • A blank line

  • The entity body

The request line

The request line is the first line of the request message and consists of three parts:

  • The request method (GET, POST, DELETE, PUT, HEAD, and so on)

  • The URI path of the requested resource

  • The HTTP version number

    GET /index.html HTTP/1.1
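As a rough sketch of how the three parts can be pulled apart, the following Python splits a request line on single spaces. The function name `parse_request_line` is an assumption for this example, and real servers apply stricter validation.

```python
def parse_request_line(line: str):
    """Split a request line into (method, target, version)."""
    parts = line.split(" ")
    if len(parts) != 3:
        raise ValueError(f"malformed request line: {line!r}")
    method, target, version = parts
    if not version.startswith("HTTP/"):
        raise ValueError(f"unexpected version: {version!r}")
    return method, target, version


print(parse_request_line("GET /index.html HTTP/1.1"))
```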

Request method

The HTTP/1.1 protocol defines eight methods, each of which acts on a given resource in a different way.

Method   Function
GET      Requests the specified resource. GET should only be used to read data and should not be used for operations that have side effects.
POST     Submits data to the specified resource and asks the server to process it (for example, submitting a form or uploading a file). The data is contained in the request body. The request may create a new resource, modify an existing resource, or both.
PUT      Uploads the latest content to the specified resource location.
DELETE   Asks the server to remove the resource identified by the Request-URI.
OPTIONS  Asks the server to return the HTTP request methods supported by the resource. Using * in place of the resource name sends the OPTIONS request to the web server itself, which can be used to test whether the server is functioning properly.
HEAD     Like GET, requests the specified resource, except that the server does not return the body. The advantage is that information about the resource (its metadata) can be retrieved without transferring the entire content.
TRACE    Echoes back the request as received by the server, mainly for testing or diagnostics.
CONNECT  Reserved in HTTP/1.1 for proxy servers that can switch the connection to tunnel mode. Typically used for connections to SSL-encrypted servers through an unencrypted HTTP proxy.

Of these, GET and POST are the most common. APIs that follow the RESTful specification generally use POST, DELETE, GET, and PUT, corresponding to creating, deleting, reading, and updating resources, respectively.
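The mapping between the four methods and the four operations can be illustrated with Python's standard `urllib.request`. This is only a sketch: the endpoint `http://example.com/api/users`, the payloads, and the helper name `build` are hypothetical, and the requests are constructed but never sent.

```python
import json
import urllib.request

BASE = "http://example.com/api/users"  # hypothetical endpoint


def build(method: str, url: str, payload=None) -> urllib.request.Request:
    """Create (but do not send) a request with an optional JSON body."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(url, data=data, headers=headers, method=method)


requests_by_action = {
    "create": build("POST", BASE, {"name": "alice"}),
    "read": build("GET", f"{BASE}/1"),
    "update": build("PUT", f"{BASE}/1", {"name": "bob"}),
    "delete": build("DELETE", f"{BASE}/1"),
}
for action, req in requests_by_action.items():
    print(action, req.get_method(), req.full_url)
```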

GET and POST

The HTTP protocol does not specify a limit on the length of a GET or POST request. The limit on GET request parameters comes from browsers and web servers, which restrict the length of the URL.

To clarify this concept, we must reiterate the following points:

  • The HTTP protocol does not specify length limits for GET and POST

  • GET appears to have a maximum length only because browsers and web servers limit the length of URIs

  • The maximum length varies from browser to browser and from web server to web server

  • If Internet Explorer must be supported, the maximum length is 2083 bytes. If only Chrome must be supported, the maximum length is 8182 bytes
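A quick way to see how these limits play out is to build a URL and compare its length with the figures quoted in the list above. The sketch below assumes those two figures, a hypothetical URL, and a helper named `fits`; actual limits depend on the browser and server versions in use.

```python
from urllib.parse import urlencode

# Limits quoted in the list above; treat them as rough guidance only.
URL_LIMITS = {"Internet Explorer": 2083, "Chrome": 8182}


def fits(url: str) -> dict:
    """Report, per browser, whether the URL is within the quoted limit."""
    length = len(url.encode("ascii"))
    return {browser: length <= limit for browser, limit in URL_LIMITS.items()}


# Hypothetical search URL with a deliberately long query value.
url = "http://example.com/search?" + urlencode({"q": "x" * 3000})
print(len(url), fits(url))
```

When a query grows past the smallest limit that has to be supported, moving the parameters into a POST body is the usual workaround.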

The essential difference

  • GET requests are similar to lookups: the user only retrieves data, and the server does not have to query the database every time, so caching can be used

  • POST is different: it generally modifies or deletes data and must interact with the database, so it cannot use caching. GET requests are therefore the ones suited to request caching

Comparison of two request methods

  • From a caching perspective, GET requests are actively cached by the browser and remain in the history, whereas POST requests are not cached by default.

  • From the perspective of encoding, GET parameters can only be URL-encoded and can only contain ASCII characters, so non-ASCII text such as Chinese must be URL-encoded, while POST has no such restriction.

  • From the perspective of parameters, GET generally carries them in the URL, where they are visible in the address bar, history, and logs, so it is not secure. POST carries them in the request body, which is more suitable for sensitive information. Note that the body is not encrypted either unless HTTPS is used.

  • From an idempotent point of view, GET is idempotent and POST is not. (Idempotent means that performing the same operation multiple times produces the same result.)

  • From a TCP perspective, a GET request is sent all at once, while some clients split a POST into two TCP packets: the header part is sent first, and if the server responds with 100 (Continue), the body part follows. (Firefox is an exception: its POST requests are sent in a single TCP packet.)

GET submits data through the URL, and the URL specification itself places no limit on its length. However, different browsers restrict URLs: for example, Internet Explorer limits URLs to about 2 KB and Chrome to about 8 KB, while Firefox theoretically has no limit and its real limit depends on the operating system. POST has no limit on data size; what really matters is the processing capacity of the server.
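The encoding point in the comparison above can be illustrated with Python's `urllib.parse`. The keyword and the `example.com` URL are assumptions chosen for the example.

```python
from urllib.parse import quote, unquote, urlencode

keyword = "中文"
encoded = quote(keyword)             # UTF-8 bytes, each written as %XX
print(encoded)                       # %E4%B8%AD%E6%96%87
print(unquote(encoded) == keyword)   # True

# urlencode applies the same escaping to every key and value of a query string.
query = urlencode({"q": keyword, "page": 1})
print("http://example.com/search?" + query)
```

After encoding, the URL contains only ASCII characters, which is what makes it safe to place in a request line.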

Request header

The information in the request headers includes cache-related headers (Cache-Control, If-Modified-Since), client identity information (User-Agent), and so on.

Each request header has the format key: value. Note the space after the colon.

    Accept: */*
    Accept-Encoding: gzip, deflate, br
    Accept-Language: zh-CN,zh;q=0.9,en;q=0.8
    Connection: keep-alive
    Content-Length: 21429
    Content-Type: application/json
    Host: api.github.com
    Origin: https://github.com
    Referer: https://github.com/
    User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36
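The key: value format is simple to parse. The sketch below splits each line at the first colon only, so values that contain colons (such as URLs) stay intact. The function name `parse_headers` is an assumption, and repeated headers are not handled (a later one overwrites an earlier one).

```python
def parse_headers(lines):
    """Turn 'Key: value' lines into a dict keyed by lower-cased header name."""
    headers = {}
    for line in lines:
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"malformed header line: {line!r}")
        # Header names are case-insensitive, so normalise them to lower case.
        headers[name.strip().lower()] = value.strip()
    return headers


sample = [
    "Host: api.github.com",
    "Content-Type: application/json",
    "Origin: https://github.com",
]
headers = parse_headers(sample)
print(headers["host"])
print(headers["origin"])
```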

Common request headers

Header             Description
Accept             The data types accepted by the browser
Accept-Encoding    The compression formats accepted by the browser
Host               The target host of the current request
Authorization      User authentication information
User-Agent         The browser type
If-Modified-Since  The time at which the client's copy of the requested resource was last modified
If-None-Match      The ETag value most recently received for the requested resource
Cookie             Cookie information saved by the browser
Referer            The address of the page from which the request originated

Request body

The request body carries the request parameters of a POST request. They are stored as key=value pairs, and multiple parameters are joined with &. If a request body is present, the Content-Length header gives the length of the request body.

    POST hysj.jsp HTTP/1.1
    Host: search.cnipr.com
    User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.9; zh-CN; rv:1.9.1.13) Gecko/20100914 Firefox/3.5.13 (.NET CLR 3.5.30729)
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    Accept-Language: zh-cn,zh;q=0.5
    Accept-Encoding: gzip,deflate
    Accept-Charset: GB2312,utf-8;q=0.7,*;q=0.7
    Keep-Alive: 300
    Connection: keep-alive
    Referer: http://search.cnipr.com/cnipr/zljs/hyjs-biaodan-y.jsp
    Content-Length: 405

    pageNo=0&pageSize=10&orderNum=306735659327926273&customerMobile=15626000000&startTime=2019-02-01%2000:00:00&endTime=2019-02-25%2014:54:20&status=SUCCESS&source=WECHAT_SHOPPING&canteenId=104&refundStatus=REFUNDED&startPayTime=2019-02-01%2000:00:00&endPayTime=2019-02-25%2014:54:47
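A small sketch of how such a body and its Content-Length fit together, using Python's `urllib.parse`. The path `/orders`, the host `example.com`, and the shortened parameter set are assumptions for illustration.

```python
from urllib.parse import parse_qs, urlencode

params = {"pageNo": 0, "pageSize": 10, "status": "SUCCESS"}
body = urlencode(params).encode("ascii")   # key=value pairs joined with &

head = "\r\n".join([
    "POST /orders HTTP/1.1",               # hypothetical path
    "Host: example.com",                   # hypothetical host
    "Content-Type: application/x-www-form-urlencoded",
    f"Content-Length: {len(body)}",        # length of the body in bytes
])
request = head.encode("ascii") + b"\r\n\r\n" + body
print(request.decode("ascii"))

# The receiving side can recover the parameters from the body.
print(parse_qs(body.decode("ascii")))
```

Content-Length counts bytes, not characters, which is why the body is encoded before its length is taken.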

The request body of an HTTP request can take three different forms depending on the application scenario:

  1. Arbitrary request body: common in mobile development. The body can be in any format; the server does not parse it automatically, so the application has to parse it itself. A POST with a JSON body is an example

  2. Query string: the body uses the same format as the query string in a URL. Key-value pairs are joined with &, each key is joined to its value with =, and only ASCII characters can be used

  3. File upload: each field or file becomes a separate segment, delimited by the value of the boundary parameter in the Content-Type header. Each segment begins with -- followed by the boundary value, then the description headers of the segment, then a blank line, then the content. The end of the request is marked with -- followed by the boundary value and a final --

Whether a segment is treated as a file depends on whether its Content-Disposition header contains filename. Because files have different types, Content-Type is also used to indicate the type of the file. If the type is unknown, the value can be application/octet-stream, indicating a binary file. If the segment is not a file, Content-Type can be omitted.
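The file upload format described above can be sketched by assembling a multipart body by hand. This is an illustrative sketch only: the helper `build_multipart`, the field names, the file content, and the fixed boundary string are all assumptions, and real clients generate a random boundary that does not occur in the data.

```python
def build_multipart(fields, files, boundary):
    """Build a multipart/form-data body.

    fields: {name: text value}
    files:  {name: (filename, content bytes, content type)}
    """
    delimiter = b"--" + boundary.encode("ascii")
    out = []
    for name, value in fields.items():
        out += [
            delimiter,
            f'Content-Disposition: form-data; name="{name}"'.encode("utf-8"),
            b"",                       # blank line ends the segment's headers
            value.encode("utf-8"),
        ]
    for name, (filename, content, ctype) in files.items():
        out += [
            delimiter,
            # The presence of filename marks this segment as a file.
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"'.encode("utf-8"),
            f"Content-Type: {ctype}".encode("ascii"),
            b"",
            content,
        ]
    out += [delimiter + b"--", b""]    # closing delimiter ends the request body
    return f"multipart/form-data; boundary={boundary}", b"\r\n".join(out)


content_type, body = build_multipart(
    {"title": "report"},
    {"upload": ("data.bin", b"\x00\x01\x02", "application/octet-stream")},
    boundary="example-boundary-123",   # hypothetical fixed boundary
)
print(content_type)
print(body.decode("latin-1"))
```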

The response message

An HTTP response has the same format as a request, except that its first line is a status line instead of a request line. The two are easy to tell apart: a status line begins with the HTTP version, whereas a request line begins with the request method.

An HTTP response packet consists of the following parts:

  • The status line

    • HTTP Version

    • Status code

    • Reason phrase

  • The response headers

  • A blank line

  • The entity body

The status line

Status code  Meaning
1xx          Informational: the request has been received and processing continues
2xx          Success: the request has been successfully received, understood, and accepted
3xx          Redirection: the resource (web page, etc.) has moved to another URL, and further action is needed to complete the request
4xx          Client error: the request has a syntax error or cannot be fulfilled
5xx          Server error: the server failed to fulfill a valid request
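As a sketch, a status line can be split into its three parts and the code mapped to one of the five classes above by its first digit. The names `STATUS_CLASSES` and `parse_status_line` are assumptions for this example.

```python
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def parse_status_line(line: str):
    """Split a status line into (version, code, reason, class)."""
    # The reason phrase may contain spaces, so split at most twice.
    version, code, *rest = line.split(" ", 2)
    status = int(code)
    reason = rest[0] if rest else ""
    return version, status, reason, STATUS_CLASSES.get(status // 100, "unknown")


print(parse_status_line("HTTP/1.1 404 Not Found"))
```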

Response headers

The response header can also be used to pass some additional information.

    HTTP/1.0 200 OK
    content-type: application/javascript; charset=utf-8
    date: Tue, 07 Mar 2017 03:06:14 GMT
    server: Domain Reliability Server
    content-length: 0
    x-xss-protection: 1; mode=block
    x-frame-options: SAMEORIGIN
    alt-svc: quic=":443"; ma=2592000; v="36,35,34"

Common response headers

Header             Description
Date               The server date and time at which the response was sent
Transfer-Encoding  The encoding used to transfer the response body
Set-Cookie         Sets cookie information
Location           Used in redirects or when a new resource is created
Server             The server name

Response body

The response body is the content itself, for example the body of a web page. The Content-Length response header usually specifies the length of the response body, which tells the browser how much data to read. For bodies that carry a large amount of data, chunked transfer encoding is also used.
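With chunked encoding, the body is sent as a series of chunks, each preceded by its size in hexadecimal, and a zero-size chunk marks the end. The decoder below is a minimal sketch under simplifying assumptions: the whole body is already in memory, and chunk extensions and trailers are ignored. The function name `decode_chunked` is hypothetical.

```python
def decode_chunked(data: bytes) -> bytes:
    """Decode a chunked-encoded body held entirely in memory."""
    body = b""
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        # The chunk size is hexadecimal; anything after ';' is a chunk extension.
        size = int(data[pos:line_end].split(b";")[0], 16)
        if size == 0:
            break                      # a zero-length chunk marks the end
        start = line_end + 2
        body += data[start:start + size]
        pos = start + size + 2         # skip the CRLF that follows the chunk data
    return body


encoded = b"5\r\nHello\r\n7\r\n, world\r\n0\r\n\r\n"
print(decode_chunked(encoded))
```

Because each chunk announces its own size, the server can start sending before it knows the total length, which is why no Content-Length is needed in this mode.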