Origin of HTTP
HTTP was initiated by Tim Berners-Lee at CERN in 1989
The best-known specification is RFC 2616, published in June 1999. It defines HTTP/1.1, a version of the protocol that is still widely used today
What is HTTP
Full Name: HyperText Transfer Protocol
Concept: HTTP is a communication protocol for accessing network resources such as HTML and images. It is the basis of data exchange on the Web, and is a client-server protocol
HTTP: The Definitive Guide describes HTTP as the multimedia courier of the Internet. Like a courier, HTTP does the legwork of carrying information between the client and the server, and the Web cannot work without it. HTTP is an application-layer protocol that is closely tied to front-end development: HTTP requests, HTTP caching, cookies, cross-origin requests, and so on all build on it
Basic features of HTTP
-
Extensible protocol. HTTP headers, introduced in HTTP/1.0, make the protocol easy to extend. As long as the server and the client agree on the semantics of a new header, new features can be added easily
-
HTTP is stateless, but not sessionless. There is no relationship between two requests carried out in succession on the same connection. This is a problem for users who need to interact with a site over several steps. On an e-commerce site, for example, a user adds a product to the shopping cart, switches pages, and adds another product. Because the two requests are unrelated, the server cannot tell which products the user has finally chosen. HTTP cookies solve this problem by using the header extension mechanism: adding cookies to the headers creates a session, so that each request shares the same context and the same state.
-
HTTP and connections. HTTP messages are sent over TCP, or over TLS (an encrypted TCP connection), although in theory any reliable transport protocol can be used. Connections are controlled at the transport layer, so they are fundamentally outside the scope of HTTP.
That is, HTTP does not require the underlying transport to be connection-based. It only requires the transport to be reliable, meaning that it does not lose messages (or at least reports an error when it does). TCP is used in practice because it is reliable, and it happens to be connection-based.
By default, HTTP/1.0 opens a separate TCP connection for each HTTP request/response pair. When several requests are issued in succession, this is less efficient than sharing one TCP connection among them. To address this, HTTP/1.1 introduced persistent connections, in which the underlying TCP connection can be partially controlled through the Connection header. HTTP/1.1 connection handling still has shortcomings, as we will see later.
Component system based on HTTP
The component system of HTTP includes a client, a Web server, and a proxy
Client: the user agent
The user agent is any tool that acts on behalf of the user. This is usually a browser, but it can also be a program that engineers and Web developers use to debug their applications
The Web server
The Web server serves the documents requested by the client. For every request sent to it, the server processes the request and returns a message called the response
Proxies
Between the browser and the server, many computers and other devices relay HTTP messages. Because of the layered structure of the network stack, most of them operate at the transport, network, or physical layer and are transparent at the HTTP layer. Those that operate at the application layer are generally called proxies
Proxies can perform several functions:
- Caching
- Filtering (such as antivirus scanning or parental controls)
- Load balancing
- Authentication (controlling access to different resources)
- Logging
HTTP Packet Composition
HTTP has two types of messages:
- Request – sent by a client to trigger an action on a server
- Response – Reply from the server side
HTTP messages consist of multiple lines of ASCII-encoded text. In HTTP/1.1 and earlier, these messages are sent over the connection as plain, human-readable text. In HTTP/2, messages are split into multiple HTTP frames. Web developers rarely write these messages by hand. Software produces them from configuration files (for proxies or servers), APIs (for browsers), or other interfaces
Typical HTTP session
- Establish a connection
In the client-server protocol, the connection is initiated by the client. Opening a connection in HTTP means starting the connection at the underlying transport layer, usually TCP. When using TCP, the default port number for the HTTP server is 80, and 8000 and 8080 are also commonly used
-
Send the client request
-
Server responds to request
HTTP request and response
HTTP requests and responses include the start line, HTTP Headers, empty line, and body, as shown in the figure below:
- Start line.
Start line of a request: request method, request path, and HTTP version. Start line of a response: HTTP version, status code, and status text
The request path takes one of four forms:
1) An absolute path, optionally followed by a '?' and a query string. This is the most common form, called the origin form, and it is used with the GET, POST, HEAD, and OPTIONS methods
POST / HTTP/1.1
GET /background.png HTTP/1.0
HEAD /test.html?query=alibaba HTTP/1.1
OPTIONS /anypage.html HTTP/1.0
2) A complete URL, called the absolute form. It is used mainly with GET when connecting to a proxy
GET http://developer.mozilla.org/en-US/docs/Web/HTTP/Messages HTTP/1.1
3) The authority component of a URL, consisting of the domain name and an optional port (prefixed by ':'), called the authority form. It is used only with CONNECT when setting up an HTTP tunnel
CONNECT developer.mozilla.org:80 HTTP/1.1
4) The asterisk form, a single asterisk ('*'), used with OPTIONS to represent the server as a whole
OPTIONS * HTTP/1.1
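The four request-target forms can be told apart mechanically. The following is a minimal sketch, not a full HTTP parser: the function name `parse_request_line` and the returned dictionary keys are illustrative choices, and the input is assumed to be a well-formed request line with single spaces.

```python
def parse_request_line(line: str) -> dict:
    """Split an HTTP/1.x request line and classify its request target."""
    method, target, version = line.strip().split(" ")
    if target == "*":
        form = "asterisk"      # OPTIONS * HTTP/1.1
    elif method == "CONNECT":
        form = "authority"     # host:port, used for tunnels
    elif target.startswith("/"):
        form = "origin"        # absolute path plus optional query string
    elif "://" in target:
        form = "absolute"      # complete URL, typically sent to a proxy
    else:
        raise ValueError(f"unrecognized request target: {target!r}")
    return {"method": method, "target": target, "version": version, "form": form}


if __name__ == "__main__":
    for sample in (
        "HEAD /test.html?query=alibaba HTTP/1.1",
        "CONNECT developer.mozilla.org:80 HTTP/1.1",
        "OPTIONS * HTTP/1.1",
    ):
        print(parse_request_line(sample))
```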
- Headers.
Request headers or response headers. Header categories are covered in the HTTP Headers section below.
Each header is a case-insensitive name followed by a colon (':') and a value whose structure depends on the header
-
An empty line. It is easy to overlook, but it marks the end of the headers
-
Body
Request body: Some requests send data to the server in order to update it. The most common case is a POST request carrying HTML form data. Request bodies generally fall into two types. One is a single-resource body, described by the Content-Type and Content-Length headers. The other is a multiple-resource body made up of several parts, usually associated with HTML forms. The two are distinguished by the value of Content-Type.
1) Content-Type: application/x-www-form-urlencoded. Form content of this type has the following characteristics:
I. The data is encoded as key=value pairs separated by '&'
II. Reserved and non-ASCII characters in keys and values are percent-encoded (URL-encoded)
// Conversion: {a: 1, b: 2} -> a=1&b=2
// The '=' and '&' separators themselves are not encoded. If the whole string had to be carried as a single value, it would become a%3D1%26b%3D2
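As a small illustration of this encoding, the sketch below uses Python's standard `urllib.parse` module. The helper name `encode_form` is made up for this example. It shows that the separators stay literal while reserved characters inside a value are percent-encoded, and that applying `quote` to the whole string produces the fully encoded form.

```python
from urllib.parse import parse_qs, quote, urlencode


def encode_form(fields: dict) -> str:
    """Encode a dict as an application/x-www-form-urlencoded body."""
    return urlencode(fields)


if __name__ == "__main__":
    print(encode_form({"a": 1, "b": 2}))      # plain keys and values
    print(encode_form({"q": "a&b=c"}))        # reserved characters inside a value
    print(quote("a=1&b=2", safe=""))          # the whole string treated as one value
    print(parse_qs(encode_form({"q": "a&b=c"})))  # decoding restores the value
```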
2) Content-Type: multipart/form-data
The Content-Type field in the request header contains a boundary parameter, whose value is chosen by the browser. Example: Content-Type: multipart/form-data; boundary=----WebkitFormBoundaryRRJKeWfHPGrS4LKe
The data is divided into multiple parts. Each part starts with a delimiter line (the boundary value prefixed with two hyphens) and carries its own HTTP headers describing that part, such as Content-Disposition and Content-Type, followed by an empty line and the part's content. The final delimiter is followed by two more hyphens.
------WebkitFormBoundaryRRJKeWfHPGrS4LKe
Content-Disposition: form-data; name="data1"
Content-Type: text/plain

data1
------WebkitFormBoundaryRRJKeWfHPGrS4LKe
Content-Disposition: form-data; name="data2"
Content-Type: text/plain

data2
------WebkitFormBoundaryRRJKeWfHPGrS4LKe--
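To make the layout of a multipart body concrete, here is a minimal sketch that assembles one by hand for text-only fields. The function name `build_multipart` and the boundary value used in the demo are assumptions for illustration. Real clients also handle file parts, filenames, and escaping, which this sketch leaves out.

```python
def build_multipart(fields: dict, boundary: str) -> tuple:
    """Return (Content-Type header value, body bytes) for text-only form fields."""
    lines = []
    for name, value in fields.items():
        lines.append(f"--{boundary}")   # delimiter: "--" followed by the boundary
        lines.append(f'Content-Disposition: form-data; name="{name}"')
        lines.append("Content-Type: text/plain")
        lines.append("")                # empty line separates part headers from content
        lines.append(value)
    lines.append(f"--{boundary}--")     # closing delimiter ends with two extra hyphens
    lines.append("")
    body = "\r\n".join(lines).encode("utf-8")
    return f"multipart/form-data; boundary={boundary}", body


if __name__ == "__main__":
    content_type, body = build_multipart(
        {"data1": "data1", "data2": "data2"}, "ExampleBoundary123"
    )
    print(content_type)
    print(body.decode("utf-8"))
```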
Response body:
1) A single-resource body of known length, described by two headers: Content-Type and Content-Length
2) A single-resource body of unknown length, sent in chunks by setting Transfer-Encoding to chunked
Content-Length, an important header introduced in HTTP/1.0, is discussed further below.
Methods
Safe methods: HTTP defines a set of methods called safe methods. GET and HEAD are both considered safe, which means they are not supposed to trigger any action on the server: the request should not change anything there. That does not guarantee that nothing happens, because in the end the behavior is up to the Web developer
- GET: Requests the server to send a resource
- HEAD: Similar to GET, but the server returns only the headers in the response. The entity body is not returned
- PUT: Writes a document to the server. Semantics: the body of the request is used to create a new document named by the request URL, or to replace it if it already exists
- POST: Used to send input data to the server, typically to submit form data. (POST sends data to the server for processing, while PUT stores data in a resource, such as a file, on the server.)
- TRACE: Mainly used for diagnostics. It performs a message loop-back test along the path to the target resource, which provides a practical debugging mechanism
- OPTIONS: Asks the Web server to report the capabilities it supports. You can ask which methods the server supports in general, or which methods are supported for a particular resource
- DELETE: Requests the server to delete the resource specified by the request URL
Difference between GET and POST
First, two concepts: side effects and idempotence. A side effect is a change to a resource on the server. A request is idempotent if sending it M times or N times (M and N different, both greater than 1) leaves the resource on the server in the same state. In typical usage, GET is for operations that are side-effect free and idempotent, while POST is mainly for operations that have side effects and are not idempotent
Technically, there are the following differences:
- Caching: GET requests can be cached, while POST requests generally are not
- Security: GET is less secure than POST, because its parameters are all in the URL and the browser saves them in its history. POST data travels in the request body, which is comparatively safer
- Limits: URLs have a length limit, which affects GET requests. The limit is set by the browser
- Encoding: GET requests can only be URL-encoded and accept only ASCII characters, while POST has no such limit. POST supports more encoding types and does not restrict the data type
- From the TCP point of view, a GET request is sent in one go, while some clients send a POST in two TCP packets: the header part first and, if the server responds with 100 (Continue), the body part afterwards. (Firefox is an exception: its POST requests are sent in a single TCP packet.)
Status code
-
100 to 199: Informational status codes
101 Switching Protocols: when HTTP is upgraded to WebSocket and the server agrees to the change, it sends status code 101
-
200 to 299: Success status codes
200 OK: the request from the client was processed correctly by the server
204 No Content: the request succeeded, but the response message contains no entity body
205 Reset Content: the request succeeded and the response message contains no entity body. Unlike 204, it asks the requester to reset the document view (for example, to clear a form)
206 Partial Content: the server is fulfilling a range request
-
300 to 399: Redirection status codes
301 Moved Permanently: permanent redirect, the resource has been assigned a new URL
302 Found: temporary redirect, the resource has temporarily been assigned a new URL
303 See Other: the resource is available at another URL and should be retrieved with the GET method
304 Not Modified: the client sent a conditional request and the resource has not changed, so the client can keep using its cached copy. The response has no body
307 Temporary Redirect: similar to 302, but the client is expected to keep the request method unchanged when sending the request to the new address
-
400 to 499: Client error status codes
400 Bad Request: the request message has a syntax error
401 Unauthorized: the request must be sent with HTTP authentication information
403 Forbidden: the server refuses access to the requested resource
404 Not Found: the requested resource was not found on the server
-
500 to 599: Server error status codes
500 Internal Server Error: an error occurred while the server was executing the request
501 Not Implemented: the server does not support a function required by the current request
503 Service Unavailable: the server is temporarily overloaded or down for maintenance and cannot process requests
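The class of a status code follows from its first digit, which a short sketch can show. The helper names `status_class` and `describe` are illustrative. The reason phrases come from Python's standard `http.HTTPStatus` enum.

```python
from http import HTTPStatus

# Class names keyed by the first digit of the status code.
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def status_class(code: int) -> str:
    """Return the class of a status code based on its first digit."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return STATUS_CLASSES[code // 100]


def describe(code: int) -> str:
    """Combine the code, its standard reason phrase, and its class."""
    return f"{code} {HTTPStatus(code).phrase} ({status_class(code)})"


if __name__ == "__main__":
    for code in (101, 204, 304, 404, 503):
        print(describe(code))
```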
HTTP Headers
1. General headers apply to both request and response messages but are unrelated to the data transmitted in the message body, such as Date
2. Request headers contain more information about the resource to be fetched or about the client itself, such as User-Agent
3. Response headers contain supplementary information about the response, such as Accept-Ranges
4. Entity headers contain more information about the entity body, such as its length (Content-Length) or its MIME type
See the HTTP Headers collection for details on Headers
The past and present of HTTP
HyperText Transfer Protocol (HTTP) is the foundational protocol of the World Wide Web. Tim Berners-Lee and his team created it between 1989 and 1991, together with the first Web browser and server
HTTP 0.9 appeared in 1991, 1.0 in 1996, and 1.1 in 1997, and 1.1 has been the most widely deployed version. Version 2.0 was released in 2015 and greatly improved on the performance and security of HTTP/1.1. HTTP/3, which got its name in 2018, continues to optimize HTTP/2 and makes the radical move of replacing TCP with QUIC, a transport built on UDP. On September 26, 2019, Chrome, Firefox, and Cloudflare announced support for HTTP/3
HTTP 0.9
A one-line protocol. A request consists of a single line that starts with the only available method, GET, followed by the path to the target resource
GET /mypage.html
Response: only the response document itself
<HTML>This is a very simple HTML page</HTML>
- No response headers, so only HTML files can be transmitted
- No status codes
HTTP 1.0
RFC 1945 describes HTTP/1.0, which was built for better extensibility
- Protocol version information is sent with each request
- A status code is sent at the start of the response
- HTTP headers are introduced for both requests and responses, allowing metadata to be transferred and making the protocol more flexible and extensible
- With the Content-Type header, documents other than plain HTML files can be transferred
In the response, the Content-Type header tells the client the content type of the returned content
A media type is a standard for indicating the nature and format of a document, file, or byte stream. Browsers usually use the MIME (Multipurpose Internet Mail Extensions) type, rather than the file extension, to decide how to process a URL, so it is important that the Web server sets the correct MIME type in the response headers. If it is set incorrectly, the site may not work properly. The structure of a MIME type is very simple: a type and a subtype, two strings separated by a '/'.
HTTP borrows part of MIME to mark the data type of the message body. The sender declares the type in the Content-Type field, and the receiver can use the Accept field to state which types it is willing to receive.
The values of the two fields fall into the following categories:
- Text: text/html, text/plain, text/css, etc.
- Image: image/gif, image/jpeg, image/png, etc.
- Audio/video: audio/mpeg, video/mp4, etc.
- Application: application/json, application/javascript, application/pdf, application/octet-stream
To let the two sides agree on the compression method, language, and character set of request and response data, the following headers are also defined
1. Compression: the sender uses Content-Encoding (the server tells the client how it encoded the entity body) and the receiver uses Accept-Encoding (the encodings the user agent supports). Values include gzip, the most popular compression format; deflate, another well-known format; and br, a compression algorithm designed specifically for HTTP
2. Language: Content-Language and Accept-Language (the natural languages the user agent supports)
3. Character set: the sender specifies it with the charset parameter of Content-Type. The receiver uses Accept-Charset (the character sets the user agent supports).
// Sender
Content-Encoding: gzip
Content-Language: zh-CN, zh, en
Content-Type: text/html; charset=utf-8
// Receiver
Accept-Encoding: gzip
Accept-Language: zh-CN, zh, en
Accept-Charset: utf-8
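A Content-Type value can be split into its type, subtype, and charset parameter with the standard library. This is a sketch that borrows Python's `email.message.Message` class, which understands MIME headers. The wrapper name `parse_content_type` is an illustrative choice, not part of any HTTP library.

```python
from email.message import Message


def parse_content_type(value: str) -> dict:
    """Split a Content-Type header value into type, subtype, and charset."""
    msg = Message()
    msg["Content-Type"] = value
    return {
        "type": msg.get_content_maintype(),
        "subtype": msg.get_content_subtype(),
        "charset": msg.get_content_charset(),  # None when no charset is given
    }


if __name__ == "__main__":
    print(parse_content_type("text/html; charset=utf-8"))
    print(parse_content_type("application/json"))
```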
Although HTTP/1.0 is a significant improvement over HTTP 0.9, it still has several shortcomings
The main drawback of HTTP/1.0 is that each TCP connection can carry only one request. Once the data has been sent, the connection is closed, and a new connection must be created to request another resource. TCP connections are expensive to create: they require a three-way handshake between client and server, and they start sending at a low rate (slow start).
The earliest model of HTTP, and the default in HTTP/1.0, is the short-lived connection. Each HTTP request is completed on its own connection, which means every request is preceded by a TCP handshake, and these handshakes happen one after another.
HTTP 1.1
HTTP/1.1 was first published as RFC 2068 in January 1997.
HTTP/1.1 clarified many ambiguities and introduced a number of improvements
-
Connections can be reused. HTTP/1.1 supports persistent connections (Connection: keep-alive), which allow multiple HTTP requests and responses to be carried over a single TCP connection. This reduces the cost and latency of opening and closing connections. Keep-alive goes some way toward fixing the HTTP/1.0 drawback of creating a connection for every request.
-
Pipelining was added to reduce communication latency by allowing a second request to be sent before the response to the first one has been fully received. On a reused TCP connection, even if several requests are sent through the pipeline at once, the server must still respond in the order in which the requests arrived. If the first response is slow, every request behind it has to wait in the queue. This is called "head-of-line blocking".
-
Chunked responses are supported through chunked transfer encoding: Transfer-Encoding: chunked
Content-Length declares the length of the response data. A keep-alive connection can carry several responses one after another, so Content-Length is what tells the receiver where one response ends and the next begins. To use the Content-Length field, the server must know the length of the response before it starts sending. For time-consuming dynamic operations, this means the server cannot send anything until the whole operation has finished, which is clearly inefficient. A better approach is to send each block of data as soon as it is produced, using "stream mode" instead of "buffer mode". HTTP/1.1 therefore allows "chunked transfer encoding" to be used instead of the Content-Length field. A request or response with a Transfer-Encoding: chunked header has a body made up of an undetermined number of data blocks. Each non-empty block is preceded by a line containing a hexadecimal value that gives the length of the block. A final block of size 0 indicates that all the data of the response has been sent.
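The chunk format described above can be sketched in a few lines. The function names `encode_chunked` and `decode_chunked` are illustrative. The decoder assumes a complete, well-formed body and ignores trailers, so it is a teaching aid rather than a production parser.

```python
def encode_chunked(chunks) -> bytes:
    """Encode an iterable of byte blocks using chunked transfer encoding."""
    out = bytearray()
    for chunk in chunks:
        if not chunk:
            continue  # a zero-length chunk would signal the end of the body
        out += f"{len(chunk):X}\r\n".encode("ascii")  # hexadecimal length line
        out += chunk + b"\r\n"
    out += b"0\r\n\r\n"  # terminating chunk of size 0
    return bytes(out)


def decode_chunked(data: bytes) -> bytes:
    """Reassemble the body from a complete chunked-encoded byte string."""
    body = bytearray()
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        # Ignore optional chunk extensions after ';'.
        size = int(data[pos:line_end].split(b";")[0].decode("ascii"), 16)
        pos = line_end + 2
        if size == 0:
            break
        body += data[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
    return bytes(body)


if __name__ == "__main__":
    wire = encode_chunked([b"Hello, ", b"world"])
    print(wire)
    print(decode_chunked(wire))
```

Because each block carries its own length, the sender can start transmitting before the total size of the response is known.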
- Additional cache control mechanisms. HTTP/1.0 mainly uses headers such as If-Modified-Since and Expires as the criteria for cache decisions. HTTP/1.1 introduces more cache control strategies, such as entity tags (ETag), If-None-Match, Cache-Control, and other optional cache headers
- Host header. It allows different domain names to be served from a server with the same IP address. Host is a request header that is new in HTTP/1.1, and it is mainly used to implement virtual hosting.
Virtual hosting, also known as shared Web hosting, uses virtualization to divide one physical server into several hosts, so that multiple websites or services can run on a single machine.
For example, suppose a server with the IP address 61.135.169.125 hosts the websites of Google, Baidu, and Taobao. Why do we see Google's home page, rather than Baidu's or Taobao's, when we visit https://www.google.com? Because the Host request header determines which virtual host is accessed.
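The role of the Host header in virtual hosting can be sketched as a simple lookup. Everything here is hypothetical: the `SITES` table, the `pick_site` function, and the page strings stand in for real server configuration, and the hostnames follow the illustrative example in the text.

```python
# Hypothetical mapping from Host header value to the site that should answer.
SITES = {
    "www.google.com": "Google home page",
    "www.baidu.com": "Baidu home page",
    "www.taobao.com": "Taobao home page",
}


def pick_site(raw_request: str) -> str:
    """Choose a virtual host by reading the Host header of a raw HTTP/1.1 request."""
    head = raw_request.split("\r\n\r\n", 1)[0]
    for line in head.split("\r\n")[1:]:      # skip the start line
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":   # header names are case-insensitive
            host = value.strip().lower().split(":")[0]  # drop an optional port
            return SITES.get(host, "default site")
    raise ValueError("HTTP/1.1 requests must include a Host header")


if __name__ == "__main__":
    request = "GET / HTTP/1.1\r\nHost: www.google.com\r\nAccept: */*\r\n\r\n"
    print(pick_site(request))
```

All three sites share one IP address, so the start line alone (`GET / HTTP/1.1`) cannot distinguish them. Only the Host header can.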
HTTP 2.0
HTTP/2 was published in 2015 as RFC 7540.
HTTP/2 is a binary protocol, not a text protocol. A few concepts first:
- Frame: the smallest unit of communication in the new protocol. The client and server communicate by exchanging frames.
- Message: a logical HTTP message, such as a request or a response, made up of one or more frames.
- Stream: a virtual channel within a connection that can carry messages in both directions. Each stream has a unique integer identifier
HTTP/2 splits HTTP/1.x messages into frames and embeds them in a stream. Data frames and header frames are kept separate, which makes header compression possible. Several streams can be combined on one connection, a process known as multiplexing, which makes more efficient use of the underlying TCP connection.
In other words, a stream carries messages, each message consists of one or more frames, and a frame is the unit of data within a stream. Binary framing improves transmission performance.
HTTP frames are transparent to Web developers. In HTTP/2, framing is an additional step between HTTP/1.1 and the underlying transport protocol. Web developers do not need to change the APIs they use in order to benefit from HTTP frames: when both the browser and the server support it, HTTP/2 is turned on and used.
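To show what "binary" means here, the sketch below packs and unpacks the 9-byte frame header that RFC 7540 defines: a 24-bit payload length, an 8-bit type, an 8-bit flags field, and a reserved bit followed by a 31-bit stream identifier. The function names are illustrative, only two frame types are named, and no validation of frame semantics is attempted.

```python
FRAME_TYPES = {0x0: "DATA", 0x1: "HEADERS"}  # a small subset of the defined types


def pack_frame(frame_type: int, flags: int, stream_id: int, payload: bytes) -> bytes:
    """Build an HTTP/2 frame: 9-byte header followed by the payload."""
    header = (
        len(payload).to_bytes(3, "big")               # 24-bit payload length
        + bytes([frame_type, flags])                  # 8-bit type, 8-bit flags
        + (stream_id & 0x7FFFFFFF).to_bytes(4, "big") # reserved bit cleared, 31-bit stream ID
    )
    return header + payload


def unpack_frame(data: bytes) -> dict:
    """Parse one frame from the start of `data`."""
    length = int.from_bytes(data[0:3], "big")
    frame_type, flags = data[3], data[4]
    stream_id = int.from_bytes(data[5:9], "big") & 0x7FFFFFFF
    return {
        "type": FRAME_TYPES.get(frame_type, hex(frame_type)),
        "flags": flags,
        "stream_id": stream_id,
        "payload": data[9:9 + length],
    }


if __name__ == "__main__":
    frame = pack_frame(0x0, 0x1, 5, b"hello")
    print(frame.hex())
    print(unpack_frame(frame))
```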
- HTTP/2 is a multiplexed protocol. Parallel requests can be handled on the same connection, which removes the ordering and blocking constraints of HTTP/1.x. Multiplexing allows multiple request-response messages to be in flight at once over a single HTTP/2 connection
As mentioned earlier, although HTTP/1.1 has persistent connections and pipelining, it still suffers from head-of-line blocking. HTTP/2 solves this problem at the HTTP level. Its new binary framing layer removes these limits and enables full request and response multiplexing: the client and server can break HTTP messages into independent frames, interleave them, and reassemble them at the other end.
As shown in the figure above, a snapshot of the connection captures several streams in flight in parallel. The client is sending a DATA frame (stream 5) to the server, while the server is sending an interleaved series of frames (stream 1 and stream 3) to the client. Three parallel streams are therefore active on one connection.
Breaking HTTP messages into independent frames, interleaving them, and reassembling them at the other end is one of the most important enhancements in HTTP/2. This mechanism sets off a chain reaction throughout the Web technology stack and brings large performance gains, because it lets us:
1. Send multiple requests in parallel, interleaved, without one affecting another
2. Send multiple responses in parallel, interleaved, without one interfering with another
3. Use a single connection to send multiple requests and responses in parallel
4. Reduce page load times by eliminating unnecessary latency and making better use of available network capacity
5. Stop doing extra work to get around HTTP/1.x limits (such as image sprites)
This is connection sharing: each request has its own ID, so one connection can carry many requests, and the frames of different requests can be mixed together arbitrarily. The receiver uses the ID to assign each frame back to the request it belongs to.
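The split, interleave, and reassemble cycle can be simulated without any networking. This is a toy model under simplifying assumptions: frames are plain `(stream_id, payload)` tuples, interleaving is round-robin, and the function names are made up for the example. Real HTTP/2 scheduling also involves priorities and flow control.

```python
from itertools import zip_longest


def split_into_frames(stream_id: int, message: bytes, frame_size: int) -> list:
    """Cut one message into (stream_id, payload) frames of at most frame_size bytes."""
    return [
        (stream_id, message[i:i + frame_size])
        for i in range(0, len(message), frame_size)
    ]


def interleave(*streams) -> list:
    """Mix frames from several streams round-robin, as they might share one connection."""
    wire = []
    for group in zip_longest(*streams):
        wire.extend(frame for frame in group if frame is not None)
    return wire


def reassemble(wire) -> dict:
    """Rebuild each message by grouping frame payloads by stream ID."""
    messages = {}
    for stream_id, payload in wire:
        messages[stream_id] = messages.get(stream_id, b"") + payload
    return messages


if __name__ == "__main__":
    s1 = split_into_frames(1, b"response for stream one", 8)
    s3 = split_into_frames(3, b"stream three", 8)
    wire = interleave(s1, s3)
    print(wire)
    print(reassemble(wire))
```

Frames of one stream stay in order relative to each other, which is why grouping by stream ID is enough to recover each message.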
For a comparison between HTTP/1.1 and HTTP/2, see this website demo
[Figure: HTTP/1.1 demonstration]
[Figure: HTTP/2 demonstration]
- Header compression. HTTP/1.x headers carry a lot of information and are sent again with every request, which has a performance cost. To reduce this overhead and improve performance, HTTP/2 compresses request and response header metadata with the HPACK format, which uses two simple but powerful techniques:
1. It allows the transmitted header fields to be encoded with a static Huffman code, which reduces the size of each transfer.
2. It requires both the client and the server to maintain and update an indexed list of previously seen header fields (in other words, a shared compression context), which is then used as a reference to encode previously transmitted values efficiently.
- Server push. Through a mechanism called server push, the server can populate the client cache ahead of time. The client does not have to request a resource explicitly: the server can push resources it knows will be needed, which reduces request latency. For example, the server can push JS and CSS files to the client proactively, instead of waiting for the client to parse the HTML and then request them
[Figure: server push]
How to upgrade your version of HTTP
The use of HTTP/1.1 or HTTP/2 is transparent to sites and applications. An up-to-date server talking to a recent browser is enough to enable HTTP/2. Only a small set of parties had to change for adoption to start, and as older browsers and servers are replaced, usage grows without any extra effort from Web developers
HTTPS
HTTPS still transmits information with HTTP, but the traffic is encrypted with the TLS protocol
Symmetric encryption and asymmetric encryption
In symmetric encryption, both sides hold the same secret key, and both know how to encrypt and decrypt with it. The difficulty is that the data travels over the network: if the secret key itself is sent over the network and intercepted, the encryption becomes meaningless
Asymmetric encryption
In asymmetric encryption there is a public key that anyone may know and use to encrypt data. Decryption, however, requires the private key, which is held only by the party that published the public key. First the server publishes its public key, so the client knows it. The client then creates a secret key, encrypts it with the public key, and sends it to the server. The server receives the ciphertext and decrypts it with its private key to recover the secret key
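The exchange described above can be walked through with textbook RSA on tiny numbers. This is strictly a toy: the primes, the exponent, and the "secret" below are illustrative values far too small to be secure, there is no padding, and real systems rely on vetted cryptographic libraries. It needs Python 3.8 or later for the modular inverse via `pow`.

```python
# Toy RSA key pair built from two small primes (illustration only, not secure).
P, Q = 61, 53
N = P * Q                            # modulus, part of both keys
E = 17                               # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent: modular inverse of E

PUBLIC_KEY = (E, N)    # published by the server
PRIVATE_KEY = (D, N)   # kept by the server


def encrypt(message: int, public_key: tuple) -> int:
    """Anyone holding the public key can encrypt."""
    e, n = public_key
    return pow(message, e, n)


def decrypt(ciphertext: int, private_key: tuple) -> int:
    """Only the holder of the private key can decrypt."""
    d, n = private_key
    return pow(ciphertext, d, n)


if __name__ == "__main__":
    client_secret = 65                               # the secret key the client made up
    ciphertext = encrypt(client_secret, PUBLIC_KEY)  # sent over the network
    recovered = decrypt(ciphertext, PRIVATE_KEY)     # done on the server
    print(ciphertext, recovered)
```

An eavesdropper sees only the ciphertext and the public key, which is why the secret can safely cross the network.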
TLS Handshake Procedure
The TLS handshake uses asymmetric encryption to agree on a key, and the traffic that follows is protected with that key
- Client Hello: The client sends a random value (Random1) together with the protocol versions and encryption methods it supports.
- Server Hello and Certificate: The server receives the client's random value and generates its own (Random2). It selects a protocol and encryption method from those offered by the client and sends its certificate (and states whether it also wants to verify a client certificate)
- Certificate Verify: The client receives the server's certificate and checks whether it is valid. If it is, the client generates another random value (Random3), encrypts it with the public key from the server's certificate, and sends it to the server. If the server asked for a client certificate, the client attaches it
- Server generates the secret: The server receives the encrypted random value and decrypts it with its private key to obtain the third random value (Random3). Both ends now hold all three random values and can derive the same key from them using the agreed encryption method. All further communication is encrypted and decrypted with this key
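The last step, where both ends turn the three random values into one shared key, can be illustrated with a stand-in derivation. Note the assumption: the HMAC-SHA256 combination below is a simplification chosen for this sketch and is not the key derivation function that TLS actually specifies. The function name `derive_session_key` is likewise made up.

```python
import hashlib
import hmac
import os


def derive_session_key(random1: bytes, random2: bytes, random3: bytes) -> bytes:
    """Combine the three handshake values into one 32-byte symmetric key.

    Stand-in derivation: random3 (the value sent encrypted) keys an HMAC
    over the two values that were exchanged in the clear.
    """
    return hmac.new(random3, random1 + random2, hashlib.sha256).digest()


if __name__ == "__main__":
    random1 = os.urandom(32)   # from Client Hello
    random2 = os.urandom(32)   # from Server Hello
    random3 = os.urandom(32)   # generated by the client, sent encrypted

    client_key = derive_session_key(random1, random2, random3)
    server_key = derive_session_key(random1, random2, random3)
    print(client_key == server_key)
```

Random1 and Random2 travel unencrypted, so the secrecy of the result rests on Random3, which only the two endpoints know.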
HTTP cache
Strong cache
Strong caching is mainly determined by the Cache-Control and Expires headers
Expires, together with the Date header, determines whether a cached response is still valid. Expires is a response header set by the Web server. It tells the browser that, until the expiration time, the data can be taken directly from the browser cache without another request. One disadvantage of Expires is that it is an absolute time based on the server's clock. If the client's clock differs significantly from the server's (for example, because the clocks are out of sync), the result can be wrong.
Cache-Control also specifies how long the current resource stays valid, and controls whether the browser takes the data directly from its cache or sends a new request to the server. The difference is that it uses a relative time.
Specifying the expiration time: max-age is the number of seconds for which the response stays fresh, counted from the time of the request. For example, the strong cache can be hit for 31536000 seconds after the request
Cache-Control: max-age=31536000
No caching at all
Cache-Control: no-store
Caching is allowed, but the cached copy must be revalidated before each use
Cache-Control: no-cache
Private and public caches
public means the response can be cached by any intermediary (such as an intermediate proxy or a CDN), while private means the response is intended for a single user: intermediaries must not cache it, and it may only be stored in the browser's private cache.
Cache-Control: private
Cache-Control: public
Validation mode: once a resource has expired (for example, its age exceeds max-age), the cache must not use it to answer later requests until it has been successfully validated with the origin server
Cache-Control: must-revalidate
Cache-Control takes priority over Expires
The strong cache process with Cache-Control works as follows:
- First request: fetched directly from the server, which sets max-age=100
- Second request: age=10, which is less than 100, so the cache is hit and the response is returned directly
- Third request: age=110, which is greater than 100, so the strong cache is no longer valid and the resource must be requested from the server again
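The three requests above boil down to comparing the age of the cached copy with max-age. A minimal sketch, with illustrative function names and handling only the max-age directive:

```python
from typing import Optional


def parse_max_age(cache_control: str) -> Optional[int]:
    """Extract max-age (in seconds) from a Cache-Control value, if present."""
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age":
            return int(value)
    return None


def is_fresh(age_seconds: int, max_age: int) -> bool:
    """A cached response can be served directly while its age is below max-age."""
    return age_seconds < max_age


if __name__ == "__main__":
    max_age = parse_max_age("max-age=100")
    print(is_fresh(10, max_age))    # second request: cache hit
    print(is_fresh(110, max_age))   # third request: must go back to the server
```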
Negotiated cache
Last-Modified and If-Modified-Since
Last-Modified gives the last time the resource was modified on the server. On the next request, the browser puts that date in the If-Modified-Since request header to ask whether the resource has been updated since then. If it has, the new resource is sent back
However, a file's modification time can change even when its content does not (for example, when it is opened and saved again), which makes Last-Modified report a change that never happened. This is why ETag was added in HTTP/1.1
ETag and If-None-Match
An ETag is like a fingerprint of the resource. A change in the resource leads to a change in its ETag, regardless of the last modification time, so an ETag identifies one specific version of a resource. The If-None-Match header sends the previously returned ETag to the server and asks whether the resource's ETag has changed. If it has, the new resource is sent back
ETag and If-None-Match take priority over Last-Modified and If-Modified-Since
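The ETag exchange can be sketched from the server's side. The assumptions are worth stating: deriving the ETag from a truncated SHA-256 hash of the body is just one possible scheme, the function names are illustrative, and only an exact match of a single ETag is handled.

```python
import hashlib
from typing import Optional


def make_etag(body: bytes) -> str:
    """Derive an ETag from the content, so it changes whenever the content does."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str] = None) -> tuple:
    """Return (status, headers, body) for a GET, honoring If-None-Match."""
    etag = make_etag(body)
    if if_none_match == etag:
        return 304, {"ETag": etag}, b""   # unchanged: the client reuses its copy
    return 200, {"ETag": etag}, body      # first request, or the resource changed


if __name__ == "__main__":
    status, headers, _ = respond(b"<html>v1</html>")
    print(status, headers)
    print(respond(b"<html>v1</html>", headers["ETag"])[0])   # same content
    print(respond(b"<html>v2</html>", headers["ETag"])[0])   # content changed
```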
[Figure: first request]
[Figure: second request for the same page]
With the negotiated cache, the server returns 304 if the resource has not changed and 200 if it has
- 200: the strong cache (Expires/Cache-Control) is no longer valid, and the server returns the new resource file
- 200 (from cache): strong cache. Expires and Cache-Control are both present and not expired, with Cache-Control taking priority over Expires, and the browser successfully obtains the resource locally
- 304 (Not Modified): negotiated cache. Last-Modified/ETag shows that the resource has not changed, and the server returns status code 304
Today, browsers report 200 (from cache) more specifically as (from disk cache) or (from memory cache)
Revving technology
HTTP caching was covered above, but we often want updated resources to take effect as soon as they are deployed.
Web developers invented a technique that Steve Souders called revving. Files that are updated infrequently are named in a specific way: a version number is added to the URL, usually in the file name.
Disadvantage: when the version number changes, every reference to the resource has to be updated as well
In practice, Web developers usually let automated build tools do this. When the infrequently updated resources (JS/CSS) change, only the entry point in the frequently updated resource (HTML) has to change.
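Build tools commonly implement revving by putting a short content hash in the file name instead of a hand-maintained version number. The sketch below shows that variant. The function name `revved_name` and the 8-character digest length are arbitrary choices for illustration.

```python
import hashlib


def revved_name(filename: str, content: bytes) -> str:
    """Insert a short content hash before the file extension."""
    digest = hashlib.sha256(content).hexdigest()[:8]
    stem, dot, ext = filename.rpartition(".")
    if not dot:                      # no extension: append the hash at the end
        return f"{filename}.{digest}"
    return f"{stem}.{digest}.{ext}"


if __name__ == "__main__":
    old = revved_name("app.js", b"console.log('v1');")
    new = revved_name("app.js", b"console.log('v2');")
    print(old, new)
    # Only the HTML entry point needs to reference the new name.
    print(f'<script src="/static/{new}"></script>')
```

Because the name changes only when the content changes, the file itself can be served with a very long max-age.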
Cookies
An HTTP Cookie (also called a Web Cookie or browser Cookie) is a small piece of data sent by the server to the user’s browser and stored locally. It will be carried and sent to the server the next time the browser makes a request to the same server.
Create a cookie
The Set-Cookie response header and the Cookie request header
Set-Cookie: <cookie-name>=<cookie-value>
Session cookies
A session cookie is the simplest kind of cookie: it is deleted automatically when the browser closes, so it is only valid for the duration of the session. Session cookies do not specify an expiration time (Expires) or a lifetime (Max-Age). Note that some browsers provide session restore, in which case session cookies are kept even after the browser is closed, as if the browser had never been closed.
Persistent cookies
Unlike session cookies, which expire when the browser is closed, persistent cookies specify an expiration time (Expires) or a lifetime (Max-Age).
Set-Cookie: id=a3fWa; Expires=Wed, 21 Oct 2015 07:28:00 GMT;
The Secure and HttpOnly flags for cookies
Cookies marked Secure should only be sent to the server over requests encrypted with HTTPS. Even so, sensitive information should not be transmitted in cookies: they are inherently insecure, and the Secure flag offers no real guarantee of protection.
Cookies with the HttpOnly flag cannot be accessed through JavaScript's document.cookie API. This helps mitigate cross-site scripting (XSS) attacks.
Set-Cookie: id=a3fWa; Expires=Wed, 21 Oct 2015 07:28:00 GMT; Secure; HttpOnly
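On the server side, a header like the one above can be assembled with Python's standard http.cookies module. This is a small sketch: build_set_cookie is a hypothetical wrapper, and it uses Max-Age instead of Expires for the persistent case.

```python
from http.cookies import SimpleCookie


def build_set_cookie(name, value, max_age=None, secure=False, http_only=False):
    """Return the value of a Set-Cookie header for one cookie."""
    cookie = SimpleCookie()
    cookie[name] = value
    morsel = cookie[name]
    if max_age is not None:
        # With a lifetime the cookie is persistent; without one it is a session cookie.
        morsel["max-age"] = max_age
    if secure:
        morsel["secure"] = True
    if http_only:
        morsel["httponly"] = True
    return morsel.OutputString()
```

Calling build_set_cookie("id", "a3fWa") yields a session cookie, while passing max_age, secure, and http_only adds the corresponding attributes to the header value.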
The scope of a Cookie
The Domain and Path attributes define the scope of a cookie: the URLs to which the cookie should be sent.
The Domain attribute specifies which hosts can receive the cookie. If it is not specified, it defaults to the current host, excluding subdomains. If Domain is specified, subdomains are included.
For example, if Domain=mozilla.org is set, the cookie is also sent to subdomains such as developer.mozilla.org.
The Path attribute specifies which paths on the host can receive the cookie (the path must exist in the request URL). With the character %x2F ("/") as the path separator, subpaths are matched as well.
For example, if you set Path=/docs, the following addresses will match:
/docs
/docs/Web/
/docs/Web/HTTP
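The matching rules for Path and Domain can be sketched as two small predicates. These are simplified illustrations with hypothetical names (path_matches, domain_matches); real browsers apply further checks, such as public-suffix restrictions, that are omitted here.

```python
def path_matches(request_path: str, cookie_path: str) -> bool:
    """True if the request path equals the cookie path or is a subpath of it."""
    if request_path == cookie_path:
        return True
    if request_path.startswith(cookie_path):
        # The match must end on a "/" boundary, so /documents does not match /docs.
        return cookie_path.endswith("/") or request_path[len(cookie_path)] == "/"
    return False


def domain_matches(host: str, cookie_domain: str) -> bool:
    """True if the host is the cookie domain itself or one of its subdomains."""
    host = host.lower()
    cookie_domain = cookie_domain.lower().lstrip(".")
    return host == cookie_domain or host.endswith("." + cookie_domain)
```

With Path=/docs, the three addresses listed above all match, while a path such as /documents does not.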
SameSite Cookies
SameSite cookies allow the server to require that a cookie not be sent on cross-site requests, which helps prevent cross-site request forgery (CSRF) attacks. SameSite accepts the following three values, which are case-insensitive:
- None: the browser continues to send cookies on both same-site and cross-site requests. This was the default in older versions of Chrome (before Chrome 80).
- Strict: the browser only sends the cookie when visiting the same site.
- Lax: the cookie is withheld on some cross-site sub-requests, such as loading images or frames, but is sent when the user navigates to the URL from an external site, for example by following a link.
Set-Cookie: key=value; SameSite=Strict
In newer browsers (Chrome 80 and later), SameSite defaults to Lax. In other words, a cookie with no SameSite attribute is treated as if SameSite=Lax had been set, which means it is no longer sent automatically on most cross-site requests. To have a cookie sent on both same-site and cross-site requests, SameSite must be explicitly set to None. For this reason, existing systems should be checked for an explicit SameSite setting, and new systems should set SameSite explicitly so that they behave consistently in old and new versions of Chrome.
For more information about cookies, see my earlier article summarizing cookie knowledge.
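The three SameSite values and the Lax default can be summarized as a decision function. This is a rough model of browser behavior, not a specification: should_send_cookie and its parameters are hypothetical, and the requirement in current browsers that SameSite=None cookies also carry Secure is left out.

```python
SAFE_METHODS = {"GET", "HEAD", "OPTIONS", "TRACE"}


def should_send_cookie(same_site, is_same_site_request, is_top_level_navigation, method):
    """Decide whether a cookie accompanies a request under its SameSite policy."""
    # A missing attribute is treated as Lax (the default since Chrome 80).
    policy = (same_site or "Lax").lower()
    if is_same_site_request:
        return True
    if policy == "none":
        return True
    if policy == "lax":
        # Cross-site: only top-level navigations with a safe method, e.g. following a link.
        return is_top_level_navigation and method.upper() in SAFE_METHODS
    # Strict: never sent on cross-site requests.
    return False
```

For example, a cross-site image load (not a top-level navigation) carries the cookie only when the policy is None.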
HTTP Access Control (CORS)
Cross-origin resource sharing (CORS) is a mechanism that uses additional HTTP headers to tell browsers that a web application running at one origin (domain) is allowed to access specified resources from a server at a different origin.
The CORS standard adds a new set of HTTP header fields that allow servers to declare which origins may access which resources through a browser.
Simple requests
A simple request (one that does not trigger a CORS preflight request) must meet the following three requirements:
- The method is one of GET, HEAD, or POST.
- The value of Content-Type is one of text/plain, multipart/form-data, or application/x-www-form-urlencoded.
- The manually set HTTP headers do not go beyond the following fields: Accept, Accept-Language, Content-Language, Content-Type (note the additional limitation above), DPR, Downlink, Save-Data, Viewport-Width, Width.
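The three requirements can be checked with a short function. This is a sketch under the assumption that headers contains only the headers set manually by the page (headers the browser adds itself, such as Origin, are excluded); is_simple_request is a hypothetical name.

```python
SIMPLE_METHODS = {"GET", "HEAD", "POST"}
SIMPLE_CONTENT_TYPES = {
    "text/plain",
    "multipart/form-data",
    "application/x-www-form-urlencoded",
}
SAFELISTED_HEADERS = {
    "accept", "accept-language", "content-language", "content-type",
    "dpr", "downlink", "save-data", "viewport-width", "width",
}


def is_simple_request(method: str, headers: dict) -> bool:
    """Return True if the request would not trigger a CORS preflight."""
    if method.upper() not in SIMPLE_METHODS:
        return False
    for name, value in headers.items():
        lname = name.lower()
        if lname not in SAFELISTED_HEADERS:
            return False
        if lname == "content-type":
            # Ignore parameters such as "; charset=UTF-8".
            media_type = value.split(";", 1)[0].strip().lower()
            if media_type not in SIMPLE_CONTENT_TYPES:
                return False
    return True
```

A POST with Content-Type: application/json, or any request with a custom header such as X-PINGOTHER, fails the check and is preceded by a preflight request.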
The following are the request packets and response packets of a simple request
Simplify as follows:
The Origin request header field indicates that the request originated from http://foo.example.
In this example, the Access-Control-Allow-Origin: * returned by the server indicates that the resource can be accessed from any external domain. If the server only allows access from http://foo.example, the header field reads as follows:
Access-Control-Allow-Origin: http://foo.example
Access-Control-Allow-Origin should be * or contain the origin specified by the Origin header field.
Preflight requests
For HTTP request methods that may have side effects on server data, the specification requires the browser to first send a preflight request using the OPTIONS method to find out whether the server allows the cross-origin request.
After the server confirms that the request is allowed, the actual HTTP request is made. In the response to the preflight request, the server can also tell the client whether credentials (including cookies and HTTP authentication data) need to be sent.
The preflight request carries the following two header fields:
Access-Control-Request-Method: POST
Access-Control-Request-Headers: X-PINGOTHER, Content-Type
The Access-Control-Request-Method header field tells the server that the actual request will use the POST method. The Access-Control-Request-Headers header field tells the server that the actual request will carry two custom request header fields: X-PINGOTHER and Content-Type. Based on this, the server decides whether the actual request is allowed.
The response to the preflight request includes the following fields:
Access-Control-Allow-Origin: http://foo.example
// The server allows the client to make requests using the POST, GET, and OPTIONS methods
Access-Control-Allow-Methods: POST, GET, OPTIONS
// The server allows the request to carry the X-PINGOTHER and Content-Type header fields
Access-Control-Allow-Headers: X-PINGOTHER, Content-Type
// The response is valid for 86,400 seconds, that is, 24 hours. Within this period, the browser does not need to send another preflight request for the same request.
Access-Control-Max-Age: 86400
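On the server, answering a preflight request might look like the sketch below. The allow-lists mirror the example values above; handle_preflight is a hypothetical function, and answering a rejected preflight with 403 is an assumption (many servers simply omit the CORS headers instead).

```python
ALLOWED_ORIGINS = {"http://foo.example"}
ALLOWED_METHODS = ["POST", "GET", "OPTIONS"]
ALLOWED_HEADERS = ["X-PINGOTHER", "Content-Type"]
MAX_AGE_SECONDS = 86400  # 24 hours


def handle_preflight(request_headers: dict):
    """Return (status, response_headers) for an OPTIONS preflight request."""
    origin = request_headers.get("Origin")
    method = request_headers.get("Access-Control-Request-Method", "").upper()
    raw = request_headers.get("Access-Control-Request-Headers", "")
    requested = {h.strip().lower() for h in raw.split(",") if h.strip()}
    allowed = {h.lower() for h in ALLOWED_HEADERS}

    if (origin not in ALLOWED_ORIGINS
            or method not in ALLOWED_METHODS
            or not requested <= allowed):
        # Assumption: reject outright instead of silently omitting CORS headers.
        return 403, {}

    return 204, {
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Methods": ", ".join(ALLOWED_METHODS),
        "Access-Control-Allow-Headers": ", ".join(ALLOWED_HEADERS),
        "Access-Control-Max-Age": str(MAX_AGE_SECONDS),
    }
```

Header names are compared case-insensitively, since HTTP header field names are not case-sensitive.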
In general, browsers do not send credentials with cross-origin XMLHttpRequest or Fetch requests. To send credentials, a special flag must be set on the XMLHttpRequest: for example, when the withCredentials flag of XMLHttpRequest is set to true, cookies are sent to the server.
For requests that carry credentials, the server must not set Access-Control-Allow-Origin to *. Because the request headers carry cookie information, the request fails if the value of Access-Control-Allow-Origin is "*". If the value is set to http://foo.example, the request succeeds.
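The browser-side check described here can be modeled as follows. This is a simplified sketch with a hypothetical name, browser_accepts; it looks only at Access-Control-Allow-Origin and Access-Control-Allow-Credentials and ignores the rest of the CORS checks.

```python
def browser_accepts(response_headers: dict, request_origin: str, with_credentials: bool) -> bool:
    """Model whether the browser exposes a cross-origin response to the page."""
    allow_origin = response_headers.get("Access-Control-Allow-Origin")
    if not with_credentials:
        # Without credentials, the wildcard or an exact origin match is enough.
        return allow_origin in ("*", request_origin)
    # With credentials, the wildcard is rejected and the server must opt in explicitly.
    return (
        allow_origin == request_origin
        and response_headers.get("Access-Control-Allow-Credentials") == "true"
    )
```

A response carrying Access-Control-Allow-Origin: * is therefore accepted for an anonymous request from http://foo.example but rejected once withCredentials is set to true.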
CORS involves the following request and response headers.
HTTP response header fields
- Access-Control-Allow-Origin: specifies the URI of the external domain allowed to access the resource. For requests that do not carry credentials, the server can set this field to the wildcard *, indicating that requests from all domains are allowed.
- Access-Control-Expose-Headers: lets the server whitelist the headers that the browser is allowed to access.
- Access-Control-Max-Age: specifies how long the result of a preflight request can be cached.
- Access-Control-Allow-Credentials: specifies whether the browser is allowed to read the response content when the request's credentials flag is set to true.
- Access-Control-Allow-Methods: used in the response to a preflight request. It indicates the HTTP methods allowed for the actual request.
- Access-Control-Allow-Headers: used in the response to a preflight request. It indicates the header fields allowed in the actual request.
HTTP request header fields
- Origin: indicates the origin of the preflight request or the actual request.
- Access-Control-Request-Method: used in a preflight request. It tells the server which HTTP method the actual request will use.
- Access-Control-Request-Headers: used in a preflight request. It tells the server which header fields the actual request will carry.
References
- MDN
- The development of HTTP
- HTTP overview
- Introduction to HTTP/2
- Cache (2): Browser cache mechanism: strong cache, negotiated cache
- (Intensive reading recommended) HTTP Soul question, strengthen your knowledge of HTTP