The HTTP protocol


What are the methods of HTTP?

  • HTTP/1.0 defines three request methods: GET, POST, and HEAD
  • HTTP/1.1 adds five request methods: OPTIONS, PUT, DELETE, TRACE, and CONNECT (PATCH was added later, in RFC 5789)

What exactly do these methods do?

  • GET: Usually used to request the server to send certain resources
  • HEAD: Requests only the headers that a GET request for the same resource would return, without the body. One use case is to check the size of a large file before downloading it and then decide whether to download it, which saves bandwidth
  • OPTIONS: Used to obtain the communication options supported by the target resource
  • POST: sends data to the server
  • PUT: Used to add a resource or replace the representation of the target resource with the payload in the request
  • DELETE: deletes a specified resource
  • PATCH: used to partially modify resources
  • CONNECT: asks a proxy server to establish a tunnel to the destination, typically so that HTTPS traffic can pass through the proxy
  • TRACE: echoes back the request as received by the server, for testing or diagnosis

What’s the difference between GET and POST?

  • Different data transmission: GET requests carry data in the URL, while POST requests carry data in the request body.
  • Different security: POST data is in the request body, so it is somewhat less exposed, while GET data is in the URL, where it can easily be found in the browser history and in caches. Neither is encrypted unless HTTPS is used.
  • Different data types: GET data must fit in a URL, so non-ASCII characters have to be percent-encoded, while a POST body has no such restriction
  • GET is harmless: browser operations such as refresh and back are harmless for GET requests, whereas with POST they may resubmit the form
  • Different properties: GET is safe and idempotent, while POST is neither safe nor idempotent
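
To make the first and third points concrete, here is a minimal sketch of how the same data ends up in a GET URL versus a POST body. The endpoint and the form fields are made up for illustration, and the functions only build plain request descriptions rather than sending anything.

```javascript
// Hypothetical endpoint and form data, used only for illustration.
const endpoint = "https://example.com/search";
const form = { q: "http 协议", page: "2" };

// GET: the data is serialized into the query string of the URL.
// URLSearchParams percent-encodes non-ASCII characters.
function buildGet(url, data) {
  const target = new URL(url);
  target.search = new URLSearchParams(data).toString();
  return { method: "GET", url: target.toString(), body: null };
}

// POST: the URL stays unchanged and the data travels in the request body.
function buildPost(url, data) {
  return {
    method: "POST",
    url,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(data),
  };
}

const getRequest = buildGet(endpoint, form);
const postRequest = buildPost(endpoint, form);

console.log(getRequest.url);
console.log(postRequest.url, postRequest.body);
```

The GET URL contains only ASCII characters after encoding, while the POST body keeps the original text as is.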

PUT and POST are used to send new resources to the server. What is the difference?

The difference between PUT and POST is that the PUT method is idempotent: making the same call several times in a row has the same effect as making it once. The POST method is non-idempotent.

There is another difference. Normally, a PUT URI points to a specific resource, whereas a POST can point to a collection of resources.

For example, when developing a blog system, we create an article with POST https://www.jianshu.com/articles. The semantics of this request are: create a new article under the articles collection. If we submit this request several times, several articles are created, so the request is non-idempotent.

PUT https://www.jianshu.com/articles/820357430, on the other hand, means: update the article at that URI (for example, change the author name). The URI points to a single resource, and the request is idempotent: if the PUT changes the author to “CAI Xukun”, then no matter how many times it is submitted, the author is still “CAI Xukun”.

P.S. The claim “POST means create a resource, PUT means update a resource” is misleading. Both can create resources; the fundamental difference is idempotency.
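
The difference is easy to see in a small simulation. The sketch below uses an in-memory Map as a stand-in for the server's storage; the function names are hypothetical and no real HTTP is involved.

```javascript
// In-memory stand-in for the server's article storage (hypothetical).
const articles = new Map();
let nextId = 1;

// POST /articles: each call creates a new article under the collection.
function postArticle(data) {
  const id = nextId++;
  articles.set(id, { ...data, id });
  return id;
}

// PUT /articles/:id: the caller names the resource, and the payload replaces it.
function putArticle(id, data) {
  articles.set(id, { ...data, id });
  return id;
}

// Two identical POSTs create two articles.
postArticle({ author: "dxy" });
postArticle({ author: "dxy" });
const afterPosts = articles.size;

// Two identical PUTs leave a single article at that URI.
putArticle(820357430, { author: "CAI Xukun" });
putArticle(820357430, { author: "CAI Xukun" });
const afterPuts = articles.size;

console.log(afterPosts, afterPuts);
```

Note that the first PUT here creates the resource and the second one leaves the state unchanged, which is why "PUT can only update" is not an accurate rule.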

Both PUT and PATCH send modifications of a resource to the server. What is the difference?

Both PUT and PATCH update resources, but PATCH is used to partially update a known resource.

Take the address https://www.jianshu.com/articles/820357430 as an example. Suppose the article at this address can be represented as:

article = {
    author: 'dxy',
    creationDate: '2019-6-12',
    content: 'I write like CAI Xukun',
    id: 820357430
}

When we want to change the author of the article, we can send PUT https://www.jianshu.com/articles/820357430 directly. The request data should then be:

{
    author: 'CAI Xukun',
    creationDate: '2019-6-12',
    content: 'I write like CAI Xukun',
    id: 820357430
}

Overwriting the whole resource like this is what PUT is for. But if sending so much unchanged information every time seems wasteful, you can send PATCH https://www.jianshu.com/articles/820357430 instead, and then only this is needed:

{
    author: 'CAI Xukun'
}
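
On the server side, the two methods could be handled roughly as follows. This is only a sketch: applyPut and applyPatch are hypothetical helpers, and the shallow merge is just one possible way to interpret a PATCH payload.

```javascript
const article = {
  author: "dxy",
  creationDate: "2019-6-12",
  content: "I write like CAI Xukun",
  id: 820357430,
};

// PUT: the payload becomes the whole new representation of the resource.
function applyPut(current, payload) {
  return { ...payload };
}

// PATCH: only the fields present in the payload change (shallow merge).
function applyPatch(current, payload) {
  return { ...current, ...payload };
}

const patched = applyPatch(article, { author: "CAI Xukun" });
const replaced = applyPut(article, { author: "CAI Xukun" });

console.log(patched);
// With PUT, sending only one field drops all the others.
console.log(replaced);
```

This is why a PUT has to carry the full article, while a PATCH can carry just the author.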

What is an HTTP request packet?

The request packet consists of four parts:

  • The request line
  • The request header
  • A blank line
  • Request body

  • Request line: contains the request method field, the URL field, and the HTTP protocol version field, separated by spaces. For example, GET /index.html HTTP/1.1.
  • Request headers: consist of key/value pairs, one per line. The key and the value are separated by a colon (:). Common examples:
  1. User-Agent: the type of browser (or other client) that generated the request.
  2. Accept: the list of content types the client can handle.
  3. Host: the requested host name. Several domain names can share one IP address (virtual hosts), so the server needs this to tell them apart.
  • Request body: the data carried by requests such as POST and PUT
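
The four parts can be seen by splitting a raw request by hand. The following is a minimal sketch, not a complete parser: parseRequest is a hypothetical helper, the sample request is made up, and real messages need more careful handling (for example, body length and repeated headers).

```javascript
// Parse a raw HTTP/1.x request message into its parts.
function parseRequest(raw) {
  // The blank line (CRLF CRLF) separates the headers from the body.
  const separator = raw.indexOf("\r\n\r\n");
  const head = separator === -1 ? raw : raw.slice(0, separator);
  const body = separator === -1 ? "" : raw.slice(separator + 4);

  // The first line is the request line; the remaining lines are headers.
  const [requestLine, ...headerLines] = head.split("\r\n");
  const [method, target, version] = requestLine.split(" ");

  const headers = {};
  for (const line of headerLines) {
    const colon = line.indexOf(":");
    // Header names are case-insensitive, so normalize them to lowercase.
    const name = line.slice(0, colon).trim().toLowerCase();
    headers[name] = line.slice(colon + 1).trim();
  }
  return { method, target, version, headers, body };
}

const rawRequest =
  "POST /articles HTTP/1.1\r\n" +
  "Host: www.jianshu.com\r\n" +
  "User-Agent: demo-client\r\n" +
  "Accept: application/json\r\n" +
  "\r\n" +
  '{"author":"dxy"}';

const parsedRequest = parseRequest(rawRequest);
console.log(parsedRequest);
```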

What is the HTTP response packet?

The response packet consists of four parts:

  • Response line
  • Response headers
  • A blank line
  • Response body

  • Response line: consists of the protocol version, the status code, and the reason phrase of the status code, for example HTTP/1.1 200 OK.
  • Response headers: key/value pairs in the same format as the request headers
  • Response body: the data the server sends back
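
As a small illustration of this layout, the sketch below assembles a response from its four parts and then reads the response line back. Both helpers are hypothetical and only handle the simple case shown here.

```javascript
// Assemble a response: response line, headers, blank line, body.
function buildResponse(status, reason, headers, body) {
  const headerLines = Object.entries(headers).map(
    ([name, value]) => `${name}: ${value}`
  );
  return [`HTTP/1.1 ${status} ${reason}`, ...headerLines, "", body].join("\r\n");
}

// Split the response line into version, status code, and reason phrase.
function parseStatusLine(line) {
  // The reason phrase may contain spaces, so split only on the first two.
  const firstSpace = line.indexOf(" ");
  const secondSpace = line.indexOf(" ", firstSpace + 1);
  return {
    version: line.slice(0, firstSpace),
    status: Number(line.slice(firstSpace + 1, secondSpace)),
    reason: line.slice(secondSpace + 1),
  };
}

const rawResponse = buildResponse(
  404,
  "Not Found",
  { "Content-Type": "text/plain", "Content-Length": "9" },
  "Not Found"
);
const statusLine = parseStatusLine(rawResponse.split("\r\n")[0]);

console.log(rawResponse);
console.log(statusLine);
```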

What are the HTTP header fields?

There are a lot of them; focus on the ones marked with “✨”.

General Header Fields: headers used by both request and response packets

  • Cache-Control: controls caching ✨
  • Connection: connection management and hop-by-hop headers ✨
  • Upgrade: upgrade to another protocol
  • Via: information about the proxy servers
  • Warning: error and warning notifications
  • Transfer-Encoding: the transfer encoding of the packet body ✨
  • Trailer: lists the headers that appear at the end of the message
  • Pragma: message directives
  • Date: the date when the packet was created

Request Header Fields: headers used by the client when sending request packets to the server

  • Accept: the media types that the client or proxy can process ✨
  • Accept-Encoding: the preferred content encodings
  • Accept-Language: the preferred natural languages
  • Accept-Charset: the preferred character sets
  • If-Match: compares entity tags (ETag) ✨
  • If-None-Match: compares entity tags (ETag), the opposite of If-Match ✨
  • If-Modified-Since: compares the resource update time (Last-Modified) ✨
  • If-Unmodified-Since: compares the resource update time (Last-Modified), the opposite of If-Modified-Since ✨
  • If-Range: sends a byte range request for the entity if the resource has not been updated
  • Range: the byte range of the entity being requested ✨
  • Authorization: web authentication information ✨
  • Proxy-Authorization: the authentication information required by the proxy server
  • Host: the server the resource is requested from ✨
  • From: the email address of the user
  • User-Agent: client program information ✨
  • Max-Forwards: the maximum number of hops
  • TE: the priority of transfer encodings
  • Referer: the URL of the page the request originated from
  • Expect: expects specific behavior from the server

Response Header Fields: headers used in the response from the server to the client

  • Accept-Ranges: whether byte range requests are accepted
  • Age: the time elapsed since the response was generated by the origin server
  • Location: the URI that the client is redirected to ✨
  • Vary: cache management information for proxy servers
  • ETag: a string that uniquely identifies a version of a resource ✨
  • WWW-Authenticate: the authentication information the server requires from the client
  • Proxy-Authenticate: the authentication information the proxy server requires from the client
  • Server: server information ✨
  • Retry-After: used with status code 503; indicates when the client may send the request again

Entity Header Fields: headers used for the entity part of request and response messages

  • Allow: the HTTP request methods the resource supports ✨
  • Content-Language: the natural language of the entity
  • Content-Encoding: the encoding of the entity
  • Content-Length: the size of the entity in bytes
  • Content-Type: the media type of the entity
  • Content-MD5: the message digest of the entity
  • Content-Location: an alternative URI for the resource
  • Content-Range: the position range of the entity body
  • Last-Modified: the time the resource was last modified ✨
  • Expires: the expiration time of the entity body ✨

What are the HTTP status codes?

2XX Success

  • 200 OK: the request from the client was processed correctly by the server ✨
  • 201 Created: the request succeeded and a new resource was created as a result
  • 202 Accepted: the request has been accepted but not yet processed; completion is not guaranteed
  • 204 No Content: the request succeeded, but the response packet contains no entity body
  • 206 Partial Content: the response to a successful range request ✨

3XX Redirection

  • 301 Moved Permanently: permanent redirect; the resource has been assigned a new URL
  • 302 Found: temporary redirect; the resource has been temporarily assigned a new URL ✨
  • 303 See Other: the resource is available at another URL and should be retrieved with GET
  • 304 Not Modified: returned for a conditional request (for example, one with If-None-Match or If-Modified-Since) when the resource has not changed; no body is sent and the client can use its cached copy
  • 307 Temporary Redirect: a temporary redirect like 302, but the request method must not change

4XX Client error

  • 400 Bad Request: the request packet contains a syntax error ✨
  • 401 Unauthorized: the request requires HTTP authentication ✨
  • 403 Forbidden: the server refuses access to the requested resource ✨
  • 404 Not Found: the requested resource was not found on the server ✨
  • 408 Request Timeout: the server timed out waiting for the client's request
  • 409 Conflict: the request conflicts with the current state of the resource

5XX Server error

  • 500 Internal Server Error: an error occurred while the server was executing the request ✨
  • 501 Not Implemented: the request needs functionality the server does not support, for example a request method the server does not recognize
  • 503 Service Unavailable: the server is temporarily overloaded or down for maintenance and cannot process requests
  • 505 HTTP Version Not Supported: the server does not support, or refuses to support, the HTTP version used in the request

302, 303, and 307 are all redirects. What is the difference?

302 is a status code from HTTP/1.0. HTTP/1.1 added 303 and 307 to refine the 302 status code.

In practice, browsers often turn a POST into a GET when following a 302, even though the original specification did not intend this. 303 makes that behavior explicit: the client should use the GET method to fetch the redirect target, so a POST request becomes a GET request. 307 does the opposite: the client must keep the original method, so a POST stays a POST.
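
The rules can be summarized in one small function. This is a sketch of typical client behavior for the three codes discussed here, not an extract from any browser; methodAfterRedirect is a hypothetical name.

```javascript
// Decide which method a client uses when following a redirect.
function methodAfterRedirect(status, method) {
  switch (status) {
    case 303:
      // 303 See Other: always fetch the target with GET (HEAD stays HEAD).
      return method === "HEAD" ? "HEAD" : "GET";
    case 302:
      // 302 Found: browsers historically rewrite POST to GET.
      return method === "POST" ? "GET" : method;
    case 307:
      // 307 Temporary Redirect: the method must not change.
      return method;
    default:
      throw new RangeError(`status ${status} is not covered by this sketch`);
  }
}

console.log(methodAfterRedirect(302, "POST"));
console.log(methodAfterRedirect(303, "POST"));
console.log(methodAfterRedirect(307, "POST"));
```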

What does HTTP keep-alive do?

In early HTTP/1.0, a new connection was created for every HTTP request, and creating a connection costs resources and time. To reduce resource consumption and shorten response times, connections need to be reused. Later HTTP/1.0 implementations and HTTP/1.1 introduced a mechanism for reusing connections: adding Connection: keep-alive to the request header tells the other party not to close the connection after the response is complete, so the same connection can be used for the next request. In HTTP/1.0, the request header must include Connection: keep-alive if you want a persistent connection. In HTTP/1.1, connections are persistent by default unless Connection: close is sent.

Advantages of Keep-Alive:

  • Lower CPU and memory usage (fewer connections are open at the same time)
  • Allows HTTP pipelining of requests and responses
  • Less network congestion (fewer TCP connections)
  • Lower latency for subsequent requests (no new handshake required)
  • Errors can be reported without closing the TCP connection
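
Whether a connection is kept open after a response depends on the protocol version and the Connection header. The sketch below captures that decision; isPersistent is a hypothetical helper and it assumes the headers are passed as an object with lowercase keys.

```javascript
// Decide whether the connection should stay open after the response.
function isPersistent(version, headers) {
  // The Connection header can hold several comma-separated tokens.
  const tokens = (headers["connection"] || "")
    .toLowerCase()
    .split(",")
    .map((token) => token.trim());

  // An explicit "close" always wins.
  if (tokens.includes("close")) return false;
  // HTTP/1.1 connections are persistent by default.
  if (version === "HTTP/1.1") return true;
  // HTTP/1.0 connections persist only when keep-alive is requested.
  if (version === "HTTP/1.0") return tokens.includes("keep-alive");
  return false;
}

console.log(isPersistent("HTTP/1.0", {}));
console.log(isPersistent("HTTP/1.0", { connection: "keep-alive" }));
console.log(isPersistent("HTTP/1.1", {}));
console.log(isPersistent("HTTP/1.1", { connection: "close" }));
```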

Why do you need HTTPS when you have HTTP?

HTTPS is the secure version of HTTP. HTTP transmits data in plain text, so it is not safe for sensitive information. HTTPS was created to solve this insecurity.

How is HTTPS secure?

It’s a complicated process, and we first need to understand two concepts.

Symmetric encryption: both sides of the communication use the same secret key for encryption and decryption. The code phrases that spies use to recognize each other are an example of symmetric encryption.

Symmetric encryption is simple and performs well, but it does not solve the problem of how to deliver the secret key to the other party the first time: an attacker can easily intercept the key in transit.

Asymmetric encryption:

  1. Private key + public key = key pair
  2. Data encrypted with the private key can be decrypted only with the corresponding public key, and data encrypted with the public key can be decrypted only with the corresponding private key
  3. Each communicating party has its own key pair and sends its public key to the other party before communication
  4. The other party encrypts its data with that public key and sends it back; the receiver then decrypts the data with its own private key

Asymmetric encryption is more secure, but the problem is that it is slow and affects performance.

Solution:

The two encryption methods are combined. The sender encrypts the symmetric key with the receiver's public key and sends it. The receiver decrypts it with its private key to obtain the symmetric key, and from then on both sides communicate using symmetric encryption.
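
Here is a minimal sketch of this combination using Node's built-in crypto module. It illustrates the idea only and is not the actual TLS handshake; the choice of RSA and AES-256-GCM, and the variable names, are assumptions made for the example.

```javascript
const nodeCrypto = require("crypto");

// Receiver: generates the asymmetric key pair and publishes the public key.
const { publicKey, privateKey } = nodeCrypto.generateKeyPairSync("rsa", {
  modulusLength: 2048,
});

// Sender: creates a random symmetric key and encrypts it with the public key.
const sessionKey = nodeCrypto.randomBytes(32);
const wrappedKey = nodeCrypto.publicEncrypt(publicKey, sessionKey);

// Receiver: recovers the symmetric key with the private key.
const unwrappedKey = nodeCrypto.privateDecrypt(privateKey, wrappedKey);

// From here on, both sides use fast symmetric encryption.
function encrypt(key, plaintext) {
  const iv = nodeCrypto.randomBytes(12);
  const cipher = nodeCrypto.createCipheriv("aes-256-gcm", key, iv);
  const data = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return { iv, data, tag: cipher.getAuthTag() };
}

function decrypt(key, { iv, data, tag }) {
  const decipher = nodeCrypto.createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(data), decipher.final()]).toString("utf8");
}

const sealed = encrypt(sessionKey, "hello over a secure channel");
const received = decrypt(unwrappedKey, sealed);
console.log(received);
```

The slow asymmetric step runs once, on 32 bytes of key material, and everything after that uses the symmetric key.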

This brings up another problem, the man-in-the-middle problem:

If there is a middleman between the client and the server, the middleman only needs to replace the public key exchanged between the two sides with his own public key. He can then easily decrypt all the data the two sides send.

Therefore, a certificate issued by a trusted third party (a certificate authority, CA) is needed to prove the identity of the other party and prevent man-in-the-middle attacks.

A certificate includes the issuer, the certificate purpose, the user's public key, the hash algorithm, and the certificate expiration time. The private key is never part of the certificate; it stays with its owner.

But what if the middleman tampers with the certificate? The proof of identity would then be worthless, and the certificate would have been bought for nothing. This calls for another technique: digital signatures.

To create a digital signature, the CA uses its hash algorithm to compute a digest of the certificate and then encrypts that digest with the CA private key. The result is the digital signature.

When someone sends me their certificate, I use the same hash algorithm to compute the digest again, and then decrypt the digital signature with the CA’s public key to obtain the digest the CA created. By comparing the two, I can tell whether the certificate was tampered with in transit.

With this in place, the security of the communication is protected as far as possible.
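
The sign-then-verify flow can be sketched with Node's crypto module. Everything below is a simplified stand-in: the CA key pair is generated on the spot, and the certificate is a made-up JSON string rather than a real X.509 certificate.

```javascript
const nodeCrypto = require("crypto");

// Hypothetical CA key pair, generated here only for the example.
const ca = nodeCrypto.generateKeyPairSync("rsa", { modulusLength: 2048 });

// Simplified certificate body (real certificates are X.509 structures).
const certificate = JSON.stringify({
  issuer: "Example CA",
  subject: "www.example.com",
  publicKey: "<server public key>",
  notAfter: "2030-01-01",
});

// CA: hash the certificate with SHA-256 and sign the digest with its private key.
const signature = nodeCrypto.sign("sha256", Buffer.from(certificate), ca.privateKey);

// Client: recompute the digest and check it against the signature
// using the CA public key.
function isAuthentic(certText, sig) {
  return nodeCrypto.verify("sha256", Buffer.from(certText), ca.publicKey, sig);
}

// A middleman who changes the certificate cannot produce a matching signature.
const tampered = certificate.replace("www.example.com", "evil.example.com");

console.log(isAuthentic(certificate, signature));
console.log(isAuthentic(tampered, signature));
```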

What are the advantages and features of HTTP2 over HTTP1.x?

Binary framing

Frame: the smallest unit of HTTP/2 data communication.

Message: a logical HTTP message in HTTP/2, such as a request or a response. A message consists of one or more frames.

Stream: a virtual channel within a connection. A stream can carry messages in both directions, and each stream has a unique integer ID.

HTTP/2 transmits data in binary format rather than HTTP 1.x’s text format, and binary protocols parse more efficiently.
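
To give a feel for the binary format, the sketch below encodes and decodes the 9-byte header that precedes every HTTP/2 frame: a 24-bit payload length, an 8-bit type, 8 bits of flags, and a reserved bit followed by a 31-bit stream ID. The helper names are hypothetical, and only the header layout is modeled, not any frame-specific payload rules.

```javascript
// Frame type codes for the two most common frame types.
const FRAME_TYPES = { DATA: 0x0, HEADERS: 0x1 };

function encodeFrame(type, flags, streamId, payload) {
  const header = Buffer.alloc(9);
  header.writeUIntBE(payload.length, 0, 3); // 24-bit payload length
  header.writeUInt8(type, 3); // 8-bit frame type
  header.writeUInt8(flags, 4); // 8-bit flags
  header.writeUInt32BE(streamId & 0x7fffffff, 5); // top bit is reserved
  return Buffer.concat([header, payload]);
}

function decodeFrame(buffer) {
  const length = buffer.readUIntBE(0, 3);
  return {
    length,
    type: buffer.readUInt8(3),
    flags: buffer.readUInt8(4),
    streamId: buffer.readUInt32BE(5) & 0x7fffffff,
    payload: buffer.subarray(9, 9 + length),
  };
}

// A DATA frame on stream 3 with the END_STREAM flag (0x1) set.
const frame = encodeFrame(FRAME_TYPES.DATA, 0x1, 3, Buffer.from("hello"));
const decoded = decodeFrame(frame);

console.log(frame);
console.log(decoded.streamId, decoded.payload.toString());
```

Because every field sits at a fixed offset, the receiver reads the header with a few integer reads instead of scanning text for delimiters.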

Server push

The server can actively push other resources while sending the page HTML, rather than waiting for the browser to parse the HTML, reach the relevant tag, and send a request. For example, the server can push JS and CSS files to the client up front, so the client does not have to request them while parsing the HTML.

The server can push, but the client has the right to refuse. If the server pushes a resource that the browser has already cached, the browser can reject it by sending RST_STREAM. Server push also complies with the same-origin policy: the server cannot push arbitrary third-party resources to the client.

Header compression

HTTP/1.x imposes an additional burden on the network by repeatedly carrying infrequently changing, verbose header data in requests and responses.

  • HTTP/2 uses “header tables” on both the client and server sides to track and store previously sent key-value pairs, rather than sending the same data with each request and response
  • The header table exists throughout the lifetime of the HTTP/2 connection and is gradually updated by both the client and the server.
  • Each new header key-value pair is either appended to the end of the current table or replaces the previous value in the table.

You can think of it as sending only differential data, not all of it, to reduce the amount of information in the header
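
The idea of a shared header table can be shown with a toy model. This is not real HPACK (there is no static table, no Huffman coding, and no eviction); the HeaderTable class is a hypothetical simplification in which a known key-value pair is sent as an index and a new pair is sent in full.

```javascript
class HeaderTable {
  constructor() {
    this.entries = []; // list of [name, value] pairs seen so far
  }

  // Encoder side: known pairs become a small index, new pairs are sent in full.
  encode(headers) {
    return Object.entries(headers).map(([name, value]) => {
      const index = this.entries.findIndex(
        ([n, v]) => n === name && v === value
      );
      if (index !== -1) return { index };
      this.entries.push([name, value]);
      return { name, value };
    });
  }

  // Decoder side: rebuild the headers and apply the same table updates.
  decode(fields) {
    const headers = {};
    for (const field of fields) {
      if ("index" in field) {
        const [name, value] = this.entries[field.index];
        headers[name] = value;
      } else {
        this.entries.push([field.name, field.value]);
        headers[field.name] = field.value;
      }
    }
    return headers;
  }
}

const clientTable = new HeaderTable();
const serverTable = new HeaderTable();

const first = clientTable.encode({ ":method": "GET", ":path": "/", "user-agent": "demo" });
const second = clientTable.encode({ ":method": "GET", ":path": "/style.css", "user-agent": "demo" });

serverTable.decode(first);
const decodedSecond = serverTable.decode(second);

console.log(second);
console.log(decodedSecond);
```

In the second request only the changed path is sent in full; the method and the user agent shrink to table indexes.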

Multiplexing

In HTTP/1.x, making several requests concurrently requires several TCP connections, and to limit resource usage the browser caps the number of concurrent TCP connections to a single domain at around 6 to 8.

HTTP2:

  • All communication with a given domain happens over a single connection.
  • A single connection can carry any number of two-way data streams.
  • Data on a stream is sent as messages, and each message consists of one or more frames. Frames from different streams can be interleaved on the connection, because the receiver reassembles them using the stream identifier in each frame header
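
A small simulation shows how interleaved frames find their way back to the right message. The helpers below are hypothetical, frames are plain objects instead of binary data, and the round-robin order is just one possible way to schedule frames.

```javascript
// Split a message into frames tagged with the stream they belong to.
function toFrames(streamId, message, chunkSize) {
  const frames = [];
  for (let i = 0; i < message.length; i += chunkSize) {
    frames.push({ streamId, payload: message.slice(i, i + chunkSize) });
  }
  return frames;
}

// Interleave the frames of several streams on one connection (round robin).
function interleave(...streams) {
  const wire = [];
  const longest = Math.max(...streams.map((frames) => frames.length));
  for (let i = 0; i < longest; i++) {
    for (const frames of streams) {
      if (i < frames.length) wire.push(frames[i]);
    }
  }
  return wire;
}

// Receiver: regroup frames by stream ID. Order within a stream is preserved.
function reassemble(wire) {
  const messages = new Map();
  for (const { streamId, payload } of wire) {
    messages.set(streamId, (messages.get(streamId) || "") + payload);
  }
  return messages;
}

const wire = interleave(
  toFrames(1, "GET /index.html", 4),
  toFrames(3, "GET /app.js", 4)
);
const messages = reassemble(wire);

console.log(wire.map((f) => f.streamId));
console.log(messages);
```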

Read more: HTTP/2 features and how they perform in practice


todo

  • What is the whole process of HTTPS?
  • What is the caching strategy for HTTP? To be continued.
