Welcome to my blog

The application layer

The application layer covers the communication activities that provide services directly to users.

Common protocols: HTTP, WebSocket, DNS, SMTP

The HTTP protocol

Features: simple, fast, flexible, connectionless (by default the connection is closed once a request has been answered), and stateless (the protocol itself cannot tell whether two requests come from the same client).

A request goes through these steps: domain name resolution -> TCP three-way handshake -> send the HTTP request -> the server responds to the HTTP request -> the browser parses the response.

HTTP is stateless because it keeps no memory of earlier transactions. If later processing needs information from an earlier request, that information has to be supplied again, which is why state is stored through cookies or server-side sessions.

HTTP/1.1 improves on HTTP/1.0 in two ways:

  1. Connection: keep-alive is the default, so the connection is not actively closed after each response
  2. Pipelining was introduced, which allows a client to send several requests over the same TCP connection without waiting for each response (the responses still come back in request order)
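To see keep-alive at work, here is a minimal sketch that uses only Python's standard library. It starts a throwaway server on the loopback interface and sends two requests through a single http.client connection. The handler name EchoHandler, the helper fetch_twice_on_one_connection, and the paths /first and /second are made up for this illustration. If the connection is reused, the server should see the same client port for both requests.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Source port of every request the server receives (illustrative bookkeeping).
seen_client_ports = []


class EchoHandler(BaseHTTPRequestHandler):
    # Hypothetical handler: answers every GET with the requested path.
    protocol_version = "HTTP/1.1"  # lets http.server keep the connection open

    def do_GET(self):
        seen_client_ports.append(self.client_address[1])
        body = self.path.encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        # A known length tells the client where the response ends,
        # so the connection can be reused for the next request.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_twice_on_one_connection():
    server = HTTPServer(("127.0.0.1", 0), EchoHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
    bodies = []
    try:
        for path in ("/first", "/second"):
            conn.request("GET", path)
            response = conn.getresponse()
            # Read the body completely before sending the next request.
            bodies.append(response.read().decode("ascii"))
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    return bodies


if __name__ == "__main__":
    print(fetch_twice_on_one_connection())
    print(seen_client_ports)
```

The two recorded ports being equal is what shows that only one TCP connection (and one handshake) was used.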

However, the shortcomings of HTTP/1.x are also obvious: headers are verbose, many of them are repeated on every request, and data is transmitted in plaintext, which is insecure.

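Advantages of HTTP/2 over HTTP/1:

  1. Everything is transmitted in binary, split into frames. Binary framing is about efficient parsing rather than security; the security gain comes from browsers supporting HTTP/2 only over TLS
  2. Frames that belong to different streams can be interleaved on the connection and are reassembled by stream ID, which speeds up transmission
  3. Multiplexing, which carries multiple bidirectional data streams over a single TCP connection

Disadvantages: HTTP/2 puts all streams on one TCP connection, so when a packet is lost, every stream on that connection waits for the retransmission (TCP head-of-line blocking)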

Advantages of HTTP/3 over HTTP/2:

  1. It runs on QUIC, which is built on UDP. Lost packets are still retransmitted, but a loss only stalls the stream it belongs to, while the other streams keep going
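The difference between the two transports can be sketched with a toy model. This is not a real network simulation: packets are just (stream, sequence) pairs, one of them is marked as lost, and the hypothetical function deliverable_before_retransmit lists what the receiver can hand to the application while it waits for the retransmission.

```python
def deliverable_before_retransmit(packets, lost_index, independent_streams):
    """Return the packets usable before the lost one is retransmitted.

    packets: list of (stream_id, sequence_number) in sending order.
    lost_index: position of the packet that was lost.
    independent_streams: False models one ordered TCP byte stream (HTTP/2),
                         True models per-stream ordering (QUIC / HTTP/3).
    """
    lost_stream = packets[lost_index][0]
    usable = []
    for index, (stream, sequence) in enumerate(packets):
        if index == lost_index:
            continue  # the lost packet itself never arrived
        if independent_streams:
            # Only later packets of the same stream have to wait.
            blocked = stream == lost_stream and index > lost_index
        else:
            # Everything sent after the gap waits, whatever its stream.
            blocked = index > lost_index
        if not blocked:
            usable.append((stream, sequence))
    return usable


if __name__ == "__main__":
    sent = [("A", 1), ("B", 1), ("A", 2), ("B", 2)]
    print(deliverable_before_retransmit(sent, 0, independent_streams=False))
    print(deliverable_before_retransmit(sent, 0, independent_streams=True))
```

With the first packet of stream A lost, the single ordered stream delivers nothing until the retransmission arrives, while with independent streams everything from stream B is usable right away.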

The HTTP message

Format of request message:

    POST /index.html HTTP/1.1
    Host: hacker.jp
    Connection: keep-alive
    Content-Type: application/x-www-form-urlencoded
    Content-Length: 16

    name=ueno&age=37

Format of response message:

    HTTP/1.1 200 OK
    Date: Tue, 10 Jul 2012 06:50:15 GMT
    Content-Length: 355
    Content-Type: text/html

    <html></html>
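The layout of a message (start line, header lines, a blank line, then the body) is simple enough to take apart by hand. The sketch below parses a form-submission request similar to the example above. The function name parse_request is an assumption for illustration, and a real server would need far more validation than this.

```python
from urllib.parse import parse_qs


def parse_request(raw: bytes):
    """Split a raw HTTP/1.1 request into its parts (illustrative only)."""
    # The first blank line (CRLF CRLF) separates the head from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # Request line: METHOD SP TARGET SP VERSION
    method, target, version = lines[0].split(" ", 2)

    # Header lines: "Name: value". Names are case-insensitive.
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    # Content-Length says how many body bytes belong to this message.
    length = int(headers.get("content-length", "0"))
    return method, target, version, headers, body[:length]


if __name__ == "__main__":
    raw = (
        b"POST /index.html HTTP/1.1\r\n"
        b"Host: hacker.jp\r\n"
        b"Connection: keep-alive\r\n"
        b"Content-Type: application/x-www-form-urlencoded\r\n"
        b"Content-Length: 16\r\n"
        b"\r\n"
        b"name=ueno&age=37"
    )
    method, target, version, headers, body = parse_request(raw)
    print(method, target, version)
    print(parse_qs(body.decode("ascii")))
```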

The difference between GET and POST:

  1. GET sends its data in the query string appended to the URL
  2. POST has a body, and the parameters go inside the body

HTTP methods:

    GET      Request the resource identified by the URI
    POST     Transfer an entity body
    PUT      Transfer a file
    DELETE   Delete a file
    HEAD     Obtain the message headers
    OPTIONS  Ask which methods the resource at the request URI supports
    TRACE    Trace the path
    CONNECT  Ask for a tunnel to be set up through the proxy
    LINK     Establish a relationship with a resource
    UNLINK   Remove a relationship with a resource

Common status codes:

Status code  Name                   Description
200          OK                     The request succeeded. Typically used for GET and POST requests
201          Created                The request succeeded and a new resource was created
301          Moved Permanently      The requested resource has been permanently moved to a new URI. The response includes the new URI in the Location field, and the browser redirects to it automatically. Future requests should use the new URI
302          Found                  Temporary move. Similar to 301, but the resource is moved only temporarily. The client should continue to use the original URI
303          See Other              See another address. Similar to 302, but the redirected request always uses GET, regardless of the method of the original request
304          Not Modified           The negotiated cache was hit. The requested resource has not been modified, so the server returns this status code without a body. The client usually keeps a cached copy and sends a header such as If-Modified-Since, asking the server to return the resource only if it changed after the given date
307          Temporary Redirect     Essentially the same as 302, except that 307 does not allow the browser to turn a POST request into a GET request when redirecting
308          Permanent Redirect     Essentially the same as 301, except that 308 does not allow the browser to turn a POST request into a GET request when redirecting
400          Bad Request            The client request has a syntax error and the server cannot understand it
403          Forbidden              The server understands the request but refuses to execute it
404          Not Found              The server could not find the resource (web page) the client asked for. A web designer can set up a custom page for this code that says "the resource you requested could not be found"
500          Internal Server Error  The server had an internal error and could not complete the request
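The redirect codes differ mainly in what happens to the request method. As a rough sketch of that rule, the hypothetical helper below returns the method a browser would typically use for the follow-up request. It models common browser behaviour rather than any single implementation.

```python
from http import HTTPStatus


def method_after_redirect(status: int, method: str) -> str:
    """Method typically used for the redirected request (illustrative)."""
    code = HTTPStatus(status)  # raises ValueError for unknown codes
    if code in (HTTPStatus.TEMPORARY_REDIRECT, HTTPStatus.PERMANENT_REDIRECT):
        # 307 and 308: the method must not change.
        return method
    if code is HTTPStatus.SEE_OTHER:
        # 303: the follow-up is a GET (a HEAD request stays HEAD).
        return method if method == "HEAD" else "GET"
    if code in (HTTPStatus.MOVED_PERMANENTLY, HTTPStatus.FOUND):
        # 301 and 302: browsers traditionally turn POST into GET.
        return "GET" if method == "POST" else method
    raise ValueError(f"{status} is not a redirect handled by this sketch")


if __name__ == "__main__":
    for status in (301, 302, 303, 307, 308):
        print(status, method_after_redirect(status, "POST"))
```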

Principle of cache

Alongside HTTP status codes, don't forget the browser caching mechanism:

  • When loading a resource, the browser first uses the Expires and Cache-Control headers of the cached response to check whether the strong cache is hit. If it is, the resource is read directly from the cache and no request is sent to the server.
  • If the strong cache is not hit, the browser must send a request to the server, which uses Last-Modified and ETag to check whether the negotiated cache is hit. If it is, the server answers with 304 and no resource data, and the browser still reads the resource from its cache.
  • If neither is hit, the resource is loaded from the server.
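The three cases above can be written down as a small decision function. This is a simplified sketch, not how any particular browser implements it: the name cache_decision and its return labels are made up, and directives such as no-cache and no-store are ignored.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def max_age_seconds(cache_control: str):
    """Extract max-age from a Cache-Control value, or None if absent."""
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return int(value)
    return None


def cache_decision(cached_headers, stored_at, now):
    """Return 'strong-cache', 'revalidate' or 'fetch' (illustrative labels).

    cached_headers: headers of the stored response, or None if nothing is cached.
    stored_at, now: timezone-aware datetimes.
    """
    if cached_headers is None:
        return "fetch"
    headers = {name.lower(): value for name, value in cached_headers.items()}

    max_age = max_age_seconds(headers.get("cache-control", ""))
    if max_age is not None:
        # Cache-Control max-age is relative and wins over Expires.
        fresh = now < stored_at + timedelta(seconds=max_age)
    elif "expires" in headers:
        # Expires is an absolute point in time.
        fresh = now < parsedate_to_datetime(headers["expires"])
    else:
        fresh = False

    if fresh:
        return "strong-cache"  # no request is sent
    if "etag" in headers or "last-modified" in headers:
        return "revalidate"    # ask the server, hoping for 304
    return "fetch"             # plain request for the full resource


if __name__ == "__main__":
    stored = datetime(2024, 1, 1, tzinfo=timezone.utc)
    headers = {"Cache-Control": "max-age=60", "ETag": '"v1"'}
    print(cache_decision(headers, stored, stored + timedelta(seconds=30)))
    print(cache_decision(headers, stored, stored + timedelta(seconds=120)))
```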

The fields Expires, Cache-Control, ETag, and Last-Modified work as follows:

  • Expires: a header introduced in HTTP/1.0 that gives the expiration time of a resource. It is an absolute time returned by the server. Because it is compared against the local clock, changing the local time can invalidate the cache.
  • Cache-Control: introduced in HTTP/1.1. Its max-age value is a relative time, and it takes priority over Expires.
  • Last-Modified, If-Modified-Since: Last-Modified is the date the resource was last changed on the server. On the next request the browser sends that date back in the If-Modified-Since request header, asking the server whether the resource has been updated since then.
  • ETag, If-None-Match: an ETag is like a fingerprint. Every change to the resource changes the ETag, regardless of when the resource was last modified. The If-None-Match request header sends the last ETag the server returned and asks whether it has changed. If it has, the server sends the new resource back.

The browser first checks whether the cached file has expired, based on its expiration time:

  • Scenario 1: the cache has not expired. No request is sent to the server and the cached result is used directly. The browser console shows 200 OK (from cache), and the browser and the server do not interact at all.
  • Scenario 2: the cache has expired. A request is sent to the server, carrying the modification time and ETag saved from the previous response, and the server decides whether the resource has been updated. From the modification time it determines whether the file has been changed since the browser's last request, and from the ETag it determines whether the content has changed.
  • Scenario 3: both checks conclude that the file has not been modified. The server does not send the file again. It simply answers 304 Not Modified, telling the browser to use its cache, and the browser takes index.html from its local cache. This is the negotiated cache, and here there is one request between the browser and the server.
  • Scenario 4: either the modification time or the content check fails. The server processes the request as usual and returns the resource, and the same steps apply again to the new copy.

Note: browsers generally cache only the responses to GET requests. POST requests are not cached.
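On the server side, the check described above fits in a few lines. The sketch below follows the article's description (every validator the browser sent must pass) and is only an illustration: the function name status_for_conditional_request is an assumption, and real servers give If-None-Match precedence over If-Modified-Since when both are present.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def status_for_conditional_request(request_headers, current_etag, last_modified):
    """Return 304 if the browser's cached copy is still valid, else 200.

    request_headers: dict with optional If-None-Match / If-Modified-Since.
    current_etag: the resource's current ETag, including its quotes.
    last_modified: timezone-aware datetime of the last change (whole seconds).
    """
    checks = []
    if "If-None-Match" in request_headers:
        # Content fingerprint: unchanged only if the ETags are identical.
        checks.append(request_headers["If-None-Match"].strip() == current_etag)
    if "If-Modified-Since" in request_headers:
        # Modification time: unchanged if nothing happened after that date.
        since = parsedate_to_datetime(request_headers["If-Modified-Since"])
        checks.append(last_modified <= since)
    # 304 only when at least one validator was sent and all of them pass.
    return 304 if checks and all(checks) else 200


if __name__ == "__main__":
    modified = datetime(2012, 7, 10, 6, 50, 15, tzinfo=timezone.utc)
    request = {
        "If-None-Match": '"abc"',
        "If-Modified-Since": "Tue, 10 Jul 2012 06:50:15 GMT",
    }
    print(status_for_conditional_request(request, '"abc"', modified))
    print(status_for_conditional_request(request, '"xyz"', modified))
```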

HTTP optimization:

  1. Leverage load balancing to optimize and accelerate HTTP applications
  2. Optimize your web site with HTTP Cache

About HTTPS

HTTPS = symmetric encryption for the data + asymmetric encryption for exchanging the key + a digital certificate for authentication. For the data itself, the sender encrypts the packets with SSL and the receiver decrypts them with SSL. The sections below describe the HTTPS communication mode and the SSL process.

HTTPS communication mode

  1. The client sends a request to the server.
  2. The server sends its digital certificate to the client.
  3. The client verifies the digital certificate and, if verification succeeds, generates a session key.

This achieves two things:

  1. The client can make a box that can only be opened with the server's key or with the client's own key.
  2. The client confirms that the server is the one it is looking for and that its identity is correct.

Here the box is the random symmetric key used to encrypt and decrypt the data, and the server's key is the server's private key.

HTTPS SSL process

(1) The client accesses the Web server through an HTTPS URL and asks for an SSL connection.

(2) After receiving the request, the Web server sends its certificate, which contains its public key, to the client's browser. The matching private key stays on the server.

(3) The client checks whether the certificate is valid. If it is invalid, a warning pops up. If it is valid, the client generates a random value and encrypts it with the public key from the certificate. The random value will serve as the session key, and its encrypted form is sent to the server.

(4) The Web server decrypts the random value with its private key and then uses it as the symmetric key to encrypt data.

(5) The client decrypts the data with the same key.
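Steps (3) to (5) can be followed with a toy example. Everything below is a deliberately insecure illustration and not how TLS is implemented: the RSA key uses tiny textbook primes, the symmetric cipher is a simple XOR against a SHA-256 digest, and the names xor_cipher and handshake_and_send are made up. It needs Python 3.8 or later for the modular inverse via pow.

```python
import hashlib
import secrets

# Toy RSA key pair for the "server" (tiny primes, for illustration only).
P, Q = 61, 53
N = P * Q                           # public modulus
E = 17                              # public exponent (sent in the certificate)
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (stays on the server)


def xor_cipher(data: bytes, session_secret: int) -> bytes:
    """Toy symmetric cipher: XOR with a keystream derived from the secret.

    Applying it twice with the same secret restores the original data.
    """
    keystream = hashlib.sha256(str(session_secret).encode("ascii")).digest()
    return bytes(b ^ keystream[i % len(keystream)] for i, b in enumerate(data))


def handshake_and_send(message: bytes):
    # Step 3 (client): pick a random value and encrypt it with the public key.
    client_secret = secrets.randbelow(N - 2) + 2
    encrypted_secret = pow(client_secret, E, N)

    # Step 4 (server): recover the random value with the private key,
    # then use it as the symmetric key to encrypt the data.
    server_secret = pow(encrypted_secret, D, N)
    ciphertext = xor_cipher(message, server_secret)

    # Step 5 (client): decrypt with the same symmetric key.
    return xor_cipher(ciphertext, client_secret)


if __name__ == "__main__":
    print(handshake_and_send(b"<html></html>"))
```

The point of the split is that the slow asymmetric step protects only the short random value, while the bulk data uses the faster symmetric key.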

Why is HTTPS reliable

HTTPS is reliable because it solves three problems.

1. Encrypted communication. Even if someone else intercepts the data, they cannot restore it to the original information.

2. Protection against man-in-the-middle attacks. An attacker cannot impersonate the server, because the server presents the client with a certificate issued by a CA.

  • If the client can verify the certificate, the certificate and its public key are authentic and were sent by the server.
  • If the client cannot verify the certificate, it is not reliable and may be a fake.

3. Credibility of the CA certificate. A company or an individual has to apply to a CA for a certificate, and the CA verifies control of the domain name, for example through email. This keeps attackers from obtaining certificates for websites they want to impersonate.

Technically, the CA's public key is built into the operating system or browser. The CA signs each certificate with its private key, and only a signature made with that private key can be verified with the CA's public key. This ensures that the certificate was indeed issued by the CA.

About the DNS resolution process

The browser searches its own DNS cache -> queries the WINS server -> performs a broadcast lookup -> reads the hosts file

DNS Query Process

  1. The requesting host first queries its local DNS server for the target domain name.
  2. The local DNS server checks whether its cache holds the IP address list for the target domain name. If it does, it returns the list.
  3. Otherwise, the local DNS server checks whether it is itself responsible for the target domain name (that is, the domain name is hosted on this server). If it is, it answers from its own records.
  4. Otherwise, the local DNS server asks a root server for the IP address list of the TLD (domain suffix) DNS servers.
  5. The local DNS server then sends a query to the TLD DNS server to obtain the IP address of the authoritative DNS server where the domain name resides.
  6. Finally, it queries the authoritative DNS server and receives the response containing the IP address.
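The chain of lookups can be mimicked with a few dictionaries. The sketch below is a toy model with no real network traffic: the server tables, the function name resolve, and all addresses (taken from the documentation-only ranges 192.0.2.0/24 and 198.51.100.0/24) are invented for illustration.

```python
# Hypothetical server tables standing in for real DNS servers.
ROOT_SERVER = {"com": "192.0.2.1"}                            # TLD -> TLD server
TLD_SERVERS = {"192.0.2.1": {"example.com": "192.0.2.53"}}    # zone -> authoritative server
AUTHORITATIVE_SERVERS = {
    "192.0.2.53": {"www.example.com": ["198.51.100.7"]}       # name -> IP address list
}


def resolve(name, cache):
    """Resolve a name the way a local DNS server would (toy model).

    Returns the IP address list and the list of servers that were asked.
    """
    # The local DNS server answers from its cache when it can.
    if name in cache:
        return cache[name], ["local cache"]

    labels = name.split(".")
    # Ask the root for the server in charge of the TLD (the suffix).
    tld_server = ROOT_SERVER[labels[-1]]
    # Ask the TLD server for the authoritative server of the zone.
    authoritative = TLD_SERVERS[tld_server][".".join(labels[-2:])]
    # Ask the authoritative server for the address list.
    addresses = AUTHORITATIVE_SERVERS[authoritative][name]

    cache[name] = addresses  # remember the answer for next time
    return addresses, ["root server", "TLD server", "authoritative server"]


if __name__ == "__main__":
    local_cache = {}
    print(resolve("www.example.com", local_cache))
    print(resolve("www.example.com", local_cache))
```

The second call is answered from the cache, which is why repeated lookups of the same name are much cheaper than the first one.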

About SMTP

SMTP is a mail transfer protocol. It is a push protocol that can only perform the following operations:

  1. Send a message from the sender's user agent to the sender's mail server
  2. Transfer a message from the sender's mail server to the receiver's mail server

So when the receiver's user agent wants to fetch mail from the receiver's mail server, one of the following protocols is used: POP3 (Post Office Protocol version 3), IMAP (Internet Message Access Protocol), or HTTP
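The first push step, from the user agent to the sender's mail server, looks roughly like this with Python's standard library. It is a sketch under assumptions: the helper names build_message and push_to_mail_server are made up, the addresses and the host mail.example.com are placeholders, and a real server will usually also require TLS and a login.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender, recipient, subject, text):
    """Create the message the user agent wants to send."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = subject
    message.set_content(text)
    return message


def push_to_mail_server(message, host, port=25):
    """Push the message to the sender's mail server over SMTP.

    Not called in the demo below because it needs a reachable server.
    """
    with smtplib.SMTP(host, port) as server:
        server.send_message(message)


if __name__ == "__main__":
    mail = build_message(
        "alice@example.com", "bob@example.org", "Hello", "Sent over SMTP."
    )
    print(mail)
    # push_to_mail_server(mail, "mail.example.com")  # placeholder host
```

Fetching the message on the receiving side would use a different library (for example poplib or imaplib), matching the split between push and access protocols described above.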

The websocket protocol

HTTP communication can only be initiated by the client, so the server cannot push data on its own. WebSocket fills this gap with full-duplex communication: once the connection is established, the client and the server are equal peers and either side can send messages at any time. Its main features:

  1. The data format is lightweight, with low performance overhead and efficient communication.
  2. It can send text or binary data.
  3. There is no same-origin restriction, so the client can communicate with any server.
  4. The protocol identifier is ws (or wss if encrypted), and the server address is written as a URL.
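Since a WebSocket address is an ordinary URL with the ws or wss scheme, it can be inspected with the standard URL parser. The helper describe_websocket_url below is a made-up name for illustration, and the URLs are placeholders. The default ports 80 and 443 are the ones the WebSocket specification assigns to ws and wss.

```python
from urllib.parse import urlsplit

# Default ports for the two WebSocket schemes.
DEFAULT_PORTS = {"ws": 80, "wss": 443}


def describe_websocket_url(url):
    """Break a ws:// or wss:// URL into the parts a client needs."""
    parts = urlsplit(url)
    if parts.scheme not in DEFAULT_PORTS:
        raise ValueError(f"not a WebSocket URL: {url}")
    return {
        "encrypted": parts.scheme == "wss",  # wss means WebSocket over TLS
        "host": parts.hostname,
        "port": parts.port or DEFAULT_PORTS[parts.scheme],
        "path": parts.path or "/",
    }


if __name__ == "__main__":
    print(describe_websocket_url("wss://example.com/chat"))
    print(describe_websocket_url("ws://example.com:8080"))
```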