As a developer, you use HTTP every day, but have you looked closely at how it is put together and what features it offers? This article walks through the protocol from end to end.

HTTP Protocol Overview

The HyperText Transfer Protocol (HTTP) is an application-layer protocol for distributed, collaborative, hypermedia information systems. It is the foundation of data communication on the World Wide Web.

Basic format of an HTTP URL:

    protocol://server[:port]/path[?query]

RFC 2616, published in June 1999, defined HTTP/1.1, which is still widely used today. The HTTP/2 standard was officially published as RFC 7540 in May 2015. HTTP/2 changes how messages are carried on the wire but keeps the semantics of HTTP/1.1, so the two versions coexist rather than one replacing the other.

HTTP Protocol Basics

HTTP is a request/response standard between a client and a server, usually carried over TCP. A browser or other tool sends an HTTP request to a specified port on the server (80 by default).

HTTP is most commonly layered on TCP/IP, but it does not strictly require it. HTTP only assumes that the underlying protocol provides reliable transport, so any protocol that guarantees this can be used.

Plain HTTP transmits data in clear text, which is insecure. HTTPS addresses this by running HTTP over an encrypted channel provided by TLS (the successor to SSL).

Basic usage process: a client opens a TCP connection to a specified port on the server and sends an HTTP request over it. The server listens on that port (80 by default), receives and processes the request, and returns a status line (for example, “HTTP/1.1 200 OK”) together with content, error messages, or other information.

HTTP Features

HTTP is stateless

HTTP is stateless: each request is independent of the others, and the protocol itself keeps no record of earlier requests or responses. Benefit: the server can process large volumes of transactions quickly, and the protocol stays scalable.

When business logic needs state, cookies and server-side sessions can be introduced to manage it.
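To make the cookie and session idea concrete, here is a minimal sketch in Python. It assumes a hypothetical cookie name (session_id), an in-memory dictionary as the session store, and a made-up handle_request function; a real framework would add expiry, signing, and persistent storage.

```python
import secrets
from http.cookies import SimpleCookie

# Hypothetical in-memory session store: session id -> per-client state.
sessions = {}


def handle_request(cookie_header):
    """Return (visit_count, set_cookie_value_or_None) for one request.

    cookie_header is the value of the request's Cookie header, or None.
    """
    cookie = SimpleCookie(cookie_header or "")
    morsel = cookie.get("session_id")
    if morsel is None or morsel.value not in sessions:
        # Unknown client: create a session and ask the client to remember its id.
        sid = secrets.token_hex(8)
        sessions[sid] = {"visits": 0}
        reply = SimpleCookie()
        reply["session_id"] = sid
        set_cookie = reply.output(header="").strip()
    else:
        # Returning client: the cookie links this request to stored state.
        sid = morsel.value
        set_cookie = None
    sessions[sid]["visits"] += 1
    return sessions[sid]["visits"], set_cookie


first_count, issued_cookie = handle_request(None)    # first visit, cookie issued
second_count, _ = handle_request(issued_cookie)      # same client returns
other_count, _ = handle_request(None)                # a different client
print(first_count, second_count, other_count)
```

The protocol itself stays stateless here: the only thing connecting the first and second request is the cookie value that the client sends back.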

Multiple HTTP requests

Loading a web page takes more than one request. The client first receives the HTML page and then requests the other resources it references (CSS, JS, images, and so on). HTTP/1.1 introduced pipelining, and HTTP/2 goes further by multiplexing many requests and responses over a single connection at the same time, which greatly increases efficiency.

Connectionless

HTTP 1.0 handles only one request per connection. Once the server has processed the client’s request and the client has received the reply, the connection is closed. The goal is to release the connection promptly and improve concurrency.

HTTP 1.1 keeps the connection open for a while after a response. If another request arrives in that time, the same connection is reused; otherwise it is closed. The goal is to improve efficiency by reducing the number of connections established in a short period.
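As a rough illustration of why one page leads to several requests, the sketch below scans an HTML document for the resources a browser would fetch next. The sample page, the base URL, and the ResourceCollector class are all made up for this example, and only a few tag types are handled.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ResourceCollector(HTMLParser):
    """Collects absolute URLs of stylesheets, scripts, and images in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("script", "img") and attrs.get("src"):
            self.resources.append(urljoin(self.base_url, attrs["src"]))
        elif tag == "link" and attrs.get("rel") == "stylesheet" and attrs.get("href"):
            self.resources.append(urljoin(self.base_url, attrs["href"]))


# Hypothetical page returned by the first request.
page = (
    '<html><head><link rel="stylesheet" href="/css/site.css">'
    '<script src="js/app.js"></script></head>'
    '<body><img src="logo.png"></body></html>'
)

collector = ResourceCollector("http://127.0.0.1/blog/index.html")
collector.feed(page)

# One request for the HTML itself plus one per referenced resource.
total_requests = 1 + len(collector.resources)
print(total_requests, collector.resources)
```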

Based on TCP

The HTTP protocol defines the format of the data exchanged between client and server and the behavior of that exchange. It is not responsible for the details of data transmission, which most implementations delegate to TCP. HTTP/1.1 uses persistent connections by default, meaning that multiple HTTP requests can share one TCP connection.
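Connection reuse can be observed with Python's standard library alone. The following sketch starts a throwaway local server (the KeepAliveHandler class and the /first and /second paths are invented for the demo), sends two requests through one http.client.HTTPConnection, and records on the server side which client address each request arrived from.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

seen_clients = []  # (ip, port) of the TCP connection behind each request


class KeepAliveHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets the server keep connections open

    def do_GET(self):
        seen_clients.append(self.client_address)
        body = self.path.encode("ascii")
        self.send_response(200)
        # Content-Length tells the client where this response ends,
        # which is what makes reusing the connection possible.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


# Port 0 asks the operating system for any free port.
server = ThreadingHTTPServer(("127.0.0.1", 0), KeepAliveHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
bodies = []
for path in ("/first", "/second"):
    conn.request("GET", path)
    response = conn.getresponse()
    bodies.append(response.read().decode("ascii"))  # drain before reusing
conn.close()

server.shutdown()
server.server_close()

# Both requests came from the same client port, i.e. the same TCP connection.
same_connection = len(set(seen_clients)) == 1
print(bodies, same_connection)
```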

HTTP workflow

The client sends a request message to the server containing the request method, URL, protocol version, request headers, and request data. The server answers with a response containing the protocol version, a success or error code, server information, response headers, and response data.

Basic steps for an HTTP request/response:

Step 1: The client (such as a browser) connects to the Web server (port 80 by default) and establishes a TCP connection.

Step 2: Initiate an HTTP request based on TCP.

Step 3: The server accepts the request and returns the corresponding response.

Step 4: Release the TCP connection.

Step 5: The client (browser) parses the HTML content and renders it.

If the URL contains a domain name rather than an IP address, the client must first ask a DNS server to resolve the domain name, and only then can it establish the TCP connection in step 1 using the resolved IP address and port number.
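Steps 1 to 4 can be sketched with a raw socket. To stay self-contained, the example below starts its own local server on 127.0.0.1 (so no DNS lookup is involved); the HelloHandler class, the fetch function, and the page content are assumptions made for this demo, and step 5 (rendering) is left out.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with a tiny HTML page."""

    def do_GET(self):
        body = b"<html><body>hello</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch(host, port, path="/"):
    # Step 1: establish a TCP connection to the server.
    with socket.create_connection((host, port), timeout=5) as conn:
        # Step 2: send the HTTP request over that connection.
        request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
        conn.sendall(request.encode("ascii"))
        # Step 3: read the response until the server closes its side.
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    # Step 4: leaving the "with" block releases the TCP connection.
    return b"".join(chunks)


# Port 0 asks the operating system for any free port.
server = HTTPServer(("127.0.0.1", 0), HelloHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

raw = fetch("127.0.0.1", server.server_address[1])
head, _, body = raw.partition(b"\r\n\r\n")
status_line = head.split(b"\r\n")[0].decode("ascii")
print(status_line)
print(body.decode("ascii"))

server.shutdown()
server.server_close()
```

The request deliberately uses HTTP/1.0 so that the server closes the connection after one response, which keeps the read loop simple.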

HTTP request packet

An HTTP request consists of four parts: the request line (which carries the request method), the request headers (message headers), an empty line, and the request body.

HTTP request packet example (the headers, the empty line, and the request body):

    Host: 127.0.0.1
    User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:67.0)
    Accept: text/html,application/xhtml+xml,application/xml
    Accept-Language: zh-CN,zh
    Accept-Encoding: gzip, deflate
    Referer: http://127.0.0.1/index.html
    Content-Type: application/x-www-form-urlencoded
    Content-Length: 29
    Connection: close
    Cookie: security=impossible; PHPSESSID=8vv0n11btuol45hqcm5recmfp7
    Upgrade-Insecure-Requests: 1

    username=admin&password=admin

Note that every line ends with a carriage return and a line feed (CRLF), and that an empty line separates the request headers from the request body.
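The layout described above (request line, headers, empty line, body, each line ending in CRLF) can be assembled by hand. This is a minimal sketch: the build_request function and the /login path are hypothetical, and the form fields are taken from the example request.

```python
from urllib.parse import urlencode


def build_request(method, path, headers, form=None):
    """Assemble a raw HTTP/1.1 request as bytes."""
    body = urlencode(form or {}).encode("ascii")
    all_headers = dict(headers)
    if body:
        all_headers["Content-Type"] = "application/x-www-form-urlencoded"
        all_headers["Content-Length"] = str(len(body))
    lines = [f"{method} {path} HTTP/1.1"]                      # request line
    lines += [f"{name}: {value}" for name, value in all_headers.items()]
    head = "\r\n".join(lines) + "\r\n\r\n"                     # CRLF + empty line
    return head.encode("ascii") + body


message = build_request(
    "POST",
    "/login",  # hypothetical path
    {"Host": "127.0.0.1", "Connection": "close"},
    {"username": "admin", "password": "admin"},
)
print(message.decode("ascii"))
```

The computed Content-Length is 29, the byte length of username=admin&password=admin, which matches the header in the example request.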

HTTP response packet

An HTTP response also consists of four parts: the status line, the response headers (message headers), an empty line, and the response body (message body).

Response packet example (headers followed by the start of the body):

    Date: Tue, 10 Aug 2021 09:09:09 GMT
    //... omitted ...
    Content-Length: 5185
    Connection: close
    Content-Type: text/html; charset=gb2312

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">

Content-Length indicates the length of the response body in bytes.
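Going the other way, a raw response can be split back into its four parts. The sketch below uses a made-up sample response and a hypothetical parse_response helper; it ignores details such as repeated headers and chunked encoding.

```python
def parse_response(raw):
    """Split a raw HTTP response into (version, code, reason, headers, body)."""
    head, separator, body = raw.partition(b"\r\n\r\n")  # empty line ends the headers
    if not separator:
        raise ValueError("no empty line between headers and body")
    lines = head.decode("iso-8859-1").split("\r\n")
    parts = lines[0].split(" ", 2)                      # status line
    version, code = parts[0], int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()   # header names are case-insensitive
    return version, code, reason, headers, body


# Made-up response for illustration.
sample = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html; charset=gb2312\r\n"
    b"Content-Length: 13\r\n"
    b"Connection: close\r\n"
    b"\r\n"
    b"<html></html>"
)

version, code, reason, headers, body = parse_response(sample)
# The body is complete when its byte length matches Content-Length.
body_complete = len(body) == int(headers["content-length"])
print(version, code, reason, body_complete)
```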

HTTP request methods

The request method tells the server what the client intends to do. HTTP/1.1 defines eight methods for operating on a specified resource. Note that method names are case-sensitive and written entirely in uppercase.

GET: obtains resources

Requests the resource identified by the specified URI. GET requests should be read-only and should have no “side effects”.

HEAD: obtains the message headers

The HEAD method is the same as GET except that the server returns no message body. It is used to check whether a URI is valid and when a resource was last updated; the response can be regarded as metadata about the resource.

POST: transmits the entity body

The POST method is used to submit an entity body, such as form data, to the server.

PUT: transfers files

The PUT method is used to transfer the latest content of a file to a specified resource location.

DELETE: deletes resources

Requests that the server delete the identified resource. It is the opposite of the PUT method.

TRACE: traces the request path

Echoes back the request as the server received it, for testing or diagnosis.

OPTIONS: asks for supported methods

Returns the HTTP request methods supported by the specified resource. Sending an OPTIONS request with ‘*’ in place of the resource name asks about the server as a whole, which can be used to test whether the server is functioning properly.

CONNECT: asks a proxy to set up a tunnel

Reserved in HTTP/1.1 for proxy servers that can switch the connection into a tunnel. It is typically used to reach SSL/TLS-encrypted servers through an unencrypted HTTP proxy server.

If the server recognizes a method but the requested resource does not allow it, status code 405 (Method Not Allowed) is returned. If the server does not implement the method at all, status code 501 (Not Implemented) is returned.

An HTTP server should implement at least the GET and HEAD methods; the other methods are optional. Whatever methods it supports, their implementation should match the semantics defined for them.
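The difference between 405 and 501 is easy to confuse: 501 means the server does not implement the method at all, while 405 means the method is known but not allowed for that particular resource. Here is a minimal sketch of how a server might decide between them. The ALLOWED routing table, its paths, and the check_method function are hypothetical.

```python
# The eight methods defined by HTTP/1.1, as listed above.
KNOWN_METHODS = {"GET", "HEAD", "POST", "PUT", "DELETE", "TRACE", "OPTIONS", "CONNECT"}

# Hypothetical routing table: which methods each resource allows.
ALLOWED = {
    "/articles": {"GET", "HEAD", "POST"},
    "/about": {"GET", "HEAD"},
}


def check_method(method, path):
    """Return (status_code, extra_headers) for a method/path combination."""
    if method not in KNOWN_METHODS:
        # The server does not implement this method at all.
        return 501, {}
    allowed = ALLOWED.get(path)
    if allowed is None:
        return 404, {}
    if method not in allowed:
        # Known method, but not permitted for this resource.
        # A 405 response lists the permitted methods in the Allow header.
        return 405, {"Allow": ", ".join(sorted(allowed))}
    return 200, {}


print(check_method("GET", "/about"))
print(check_method("POST", "/about"))
print(check_method("get", "/about"))  # lowercase: method names are case-sensitive
```

Because method names are case-sensitive, the lowercase "get" is not recognized and falls into the 501 branch in this sketch.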

HTTP status codes

The status code tells the client the result of the server’s processing of the request. The first line of an HTTP response is the status line, which contains the protocol version, the status code, and a reason phrase, for example “HTTP/1.1 200 OK”.

Status codes fall into the following classes:

  • 1XX informational: The request was received and is being processed
  • 2XX success: The request was processed successfully
  • 3XX redirection: Additional action is required to complete the request
  • 4XX client error: The request contains a syntax error or cannot be fulfilled
  • 5XX server error: The server failed to process the request

Common status codes:

  • 200: The client request succeeded.
  • 302: Temporary redirect; the resource is temporarily available at a different URL.
  • 400: The client request has a syntax error and cannot be understood by the server.
  • 401: The request is unauthorized; authentication is required.
  • 403: The server received the request but refuses to serve it.
  • 404: The requested resource does not exist. One of the most common client errors.
  • 500: Internal server error. One of the most common server errors.
  • 503: The server cannot currently process the request and may recover after a period of time.
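Since the class of a status code is determined by its first digit, classification reduces to an integer division. The sketch below pairs that with the reason phrases shipped in Python's http.HTTPStatus; the CLASSES mapping and the describe function are names chosen for this example.

```python
from http import HTTPStatus

# Status class by leading digit, following the list of classes above.
CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code):
    """Return a short description such as '404 Not Found (client error)'."""
    category = CLASSES.get(code // 100, "unknown")
    try:
        phrase = HTTPStatus(code).phrase   # standard reason phrase, if registered
    except ValueError:
        phrase = "Unknown"
    return f"{code} {phrase} ({category})"


for code in (200, 302, 404, 503):
    print(describe(code))
```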

URL form

An HTTP Uniform Resource Locator (URL) is made up of the following parts, in order:

  • The transport protocol.
  • The hierarchical URL marker (“//”, fixed).
  • The credentials needed to access the resource (optional, usually omitted).
  • The server (usually a domain name, sometimes an IP address).
  • The port number (a number; can be omitted when it is the HTTP default, 80).
  • The path (directory names in the path are separated by slash characters).
  • The query (form parameters in GET mode: it starts with “?”, parameters are separated by “&”, and “=” separates each parameter name from its value; it is usually URL-encoded as UTF-8 to avoid character conflicts).
  • The fragment (starts with the “#” character).

Take www.choupangxia.com:80/blog/index… as an example:

  • http is the protocol;
  • www.choupangxia.com is the server;
  • 80 is the network port on the server; it is the HTTP default and is normally not displayed;
  • /blog/index.html is the path (it points directly to the corresponding resource);
  • ?id=10&page=1 is the query.
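The same breakdown can be reproduced with urllib.parse from Python's standard library. The full URL below is an assumption, reassembled from the components listed above; the parameters passed to urlencode are made up to show how special characters are escaped.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumed full URL, pieced together from the listed components.
url = "http://www.choupangxia.com:80/blog/index.html?id=10&page=1"

parts = urlsplit(url)
print(parts.scheme)     # protocol
print(parts.hostname)   # server
print(parts.port)       # port number
print(parts.path)       # path
print(parts.query)      # query string without the leading "?"

# The query splits into parameters on "&" and into name/value on "=".
params = parse_qs(parts.query)
print(params)

# URL-encoding keeps "&", "=", and spaces inside values from being
# mistaken for separators (hypothetical parameters).
encoded = urlencode({"q": "a b&c", "page": 2})
print(encoded)
```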

Summary

This concludes the overview of HTTP. It focused on HTTP usage scenarios, the formats of request and response packets, the usage process, and the protocol’s features.

About the blogger: Author of the technology book SpringBoot Inside Technology, loves to delve into technology and writes technical articles.
