1. What happens after you enter a URL?

  1. If you entered a domain name, the first step is domain name resolution (DNS) to find the IP address
  2. A TCP three-way handshake establishes the connection
  3. The browser sends a request message and the server returns a response; static resources such as images and CSS, as well as Ajax data, are requested in the same way
  4. The browser receives the response and processes it: it builds the DOM, executes JS scripts, and so on (the response may also come directly from an intermediate cache such as a CDN server or load balancer, in which case the request never reaches the origin server)
  5. When the connection is closed, the four-way wave (connection teardown) is performed
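
The five steps above can be sketched with Python's standard library. This is only an illustration under stated assumptions: `HelloHandler`, `fetch`, and `demo` are hypothetical names, and a throwaway local server stands in for a real website so that the example does not depend on the network. A real browser does far more work (caching, TLS, rendering).

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a real web server."""

    def do_GET(self):
        body = b"<h1>hello</h1>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch(host: str, port: int, path: str = "/") -> bytes:
    # Step 1: name resolution (host name -> IPv4 address).
    ip = socket.getaddrinfo(host, port, socket.AF_INET, socket.SOCK_STREAM)[0][4][0]
    # Step 2: connecting performs the TCP three-way handshake.
    with socket.create_connection((ip, port), timeout=5) as sock:
        # Step 3: send the request message.
        request = f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
        sock.sendall(request.encode("ascii"))
        # Step 4: read the response until the server closes its side.
        chunks = []
        while chunk := sock.recv(4096):
            chunks.append(chunk)
    # Step 5: leaving the with-block closes the socket (connection teardown).
    return b"".join(chunks)


def demo() -> bytes:
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch("localhost", server.server_address[1])
    finally:
        server.shutdown()
        server.server_close()
```

Calling `demo()` returns the raw response bytes: a status line, the header fields, a blank line, and then the HTML body.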

As shown in the figure:

  • The first is a three-way handshake
  • The second and third rows are two data requests and their responses
  • Because the connection is persistent (keep-alive), there is no four-way wave

2. TCP three-way handshake

  • First handshake ([SYN], Seq = x):

The client initiates the connection by sending a packet with the SYN flag set and its initial sequence number x. After sending it, the client enters the SYN_SENT state

  • Second handshake ([SYN, ACK], Seq = y, ACK = x + 1):

The server sends an ACK and a SYN in the same packet. ACK = x + 1 indicates that the client's SYN was received, and Seq = y is the server's own initial sequence number. After sending it, the server enters the SYN_RCVD state

  • Third handshake ([ACK], ACK = y + 1):

The client sends another ACK packet. ACK = y + 1 indicates that it received the server's packet. After sending it, the client's state is ESTABLISHED; once the server receives it, the server's state is also ESTABLISHED. The TCP handshake is complete

Why not 2 handshakes? If the server replied and the client never sent a final acknowledgement, the server could not tell whether the client is able to receive its messages. Why not 4 handshakes? Three handshakes already confirm that both client and server can send and receive, so a fourth would only waste resources
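
To make the sequence and acknowledgement numbers concrete, here is a minimal sketch that only computes the three segments; it does not touch the network. `Segment` and `three_way_handshake` are hypothetical names used for illustration, and the initial sequence numbers are arbitrary.

```python
from dataclasses import dataclass
from typing import List, Optional

SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


@dataclass
class Segment:
    sender: str
    flags: str
    seq: int
    ack: Optional[int] = None


def three_way_handshake(x: int, y: int) -> List[Segment]:
    """Return the three segments for client ISN x and server ISN y."""
    # First handshake: the client sends SYN with Seq = x.
    syn = Segment("client", "SYN", seq=x)
    # Second handshake: the server answers with its own Seq = y and ACK = x + 1.
    syn_ack = Segment("server", "SYN,ACK", seq=y, ack=(syn.seq + 1) % SEQ_SPACE)
    # Third handshake: the client acknowledges with ACK = y + 1.
    ack = Segment("client", "ACK", seq=syn_ack.ack, ack=(syn_ack.seq + 1) % SEQ_SPACE)
    return [syn, syn_ack, ack]


for segment in three_way_handshake(x=1000, y=5000):
    print(segment)
```

Each side acknowledges the other's initial sequence number plus one, which is how both directions of the connection get confirmed.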

3. TCP data request/response

  • Client initiates data request (GET / HTTP/1.1):

After the connection is established through the three-way handshake, the client sends a request over TCP, such as a GET request

  • Server responds to GET request ([ACK], Seq = x):

After receiving the request, the server sends an ACK packet to the client to indicate that the request has been received

  • Server sends the requested data (HTTP/1.1 200 OK):

When the server finds the resource for the URI, it returns the data to the client with status code 200 (or another status code)

  • Client acknowledges receipt ([ACK], Seq = y):

After receiving the data, the client sends an ACK packet to indicate that the data has arrived

4. TCP four-way wave

  • First wave ([FIN], Seq = x):

The client sends a packet with the FIN flag set to tell the server that it wants to close the connection. After sending it, the client enters the FIN_WAIT_1 state. From this point the client no longer sends data, but it can still receive data

  • Second wave ([ACK], ACK = x + 1):

The server sends an ACK to tell the client that it received the close request. The server enters the CLOSE_WAIT state, and the client enters FIN_WAIT_2 after receiving the packet. The server may still be sending data to the client

  • Third wave ([FIN], Seq = y):

After it finishes sending data to the client, the server sends a packet with the FIN flag set to tell the client that it is ready to close. After sending it, the server enters the LAST_ACK state

  • Fourth wave ([ACK], ACK = y + 1):

After receiving the server's FIN, the client sends an ACK packet and enters the TIME_WAIT state, where it waits in case the server retransmits its FIN (which would mean the ACK was lost)

After receiving the ACK packet, the server closes the connection and enters the CLOSED state. If the client receives no retransmitted FIN within the waiting period (2 × MSL), it also closes the connection and enters CLOSED

Why not three waves? If the server closed as soon as it received the client's FIN, the data for the current request might not have been sent completely, resulting in data loss. So the server first acknowledges the FIN, finishes sending its data, and only then sends its own FIN to confirm the close
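
The state changes during the four waves can be laid out as a small trace. This is a simplified sketch, not a TCP implementation: `four_way_wave` is a hypothetical helper, and the states listed for each row are those of both sides once that segment has been delivered.

```python
from typing import List, Tuple

# (direction, segment, client state afterwards, server state afterwards)
Step = Tuple[str, str, str, str]


def four_way_wave(x: int, y: int) -> List[Step]:
    """Trace an active close by the client with FIN sequence numbers x and y."""
    return [
        # First wave: the client stops sending and asks to close.
        ("client -> server", f"[FIN] Seq={x}", "FIN_WAIT_1", "CLOSE_WAIT"),
        # Second wave: the server acknowledges; it may still send data.
        ("server -> client", f"[ACK] ACK={x + 1}", "FIN_WAIT_2", "CLOSE_WAIT"),
        # Third wave: the server has finished sending and closes its side.
        ("server -> client", f"[FIN] Seq={y}", "TIME_WAIT", "LAST_ACK"),
        # Fourth wave: the client acknowledges and waits in TIME_WAIT.
        ("client -> server", f"[ACK] ACK={y + 1}", "TIME_WAIT", "CLOSED"),
        # No retransmitted FIN arrived during the wait, so the client closes too.
        ("client timer", "2*MSL elapsed", "CLOSED", "CLOSED"),
    ]


for direction, segment, client_state, server_state in four_way_wave(x=2000, y=7000):
    print(f"{direction:18} {segment:16} client={client_state:11} server={server_state}")
```

The gap between the second and third rows is the window in which the server can still send the rest of its data, which is why the ACK and the FIN are sent separately.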

5. HTTP message structure

  1. HTTP request messages and response messages have basically the same structure, which consists of three parts:
    • Start line: basic information about the request or response
    • Header fields: detailed attributes of the message, in key-value form
    • Message body: the data actually transmitted, such as text or images

The first two parts are often referred to together as the request or response header, while the third part is called the body

  2. According to the HTTP protocol, a message must contain a header but may omit the body
  3. The header must be followed by a blank line
  4. The start line is called the request line in a request and the status line in a response
  • Request line: the start line of a request message, made up of three parts:

Request method + request target + version, for example GET / HTTP/1.1

  • Status line: the start line of a response message, made up of three parts:

Version + status code + reason phrase, for example HTTP/1.1 200 OK

  5. Header fields are key-value pairs in the form key: value

    • General fields: can appear in both request and response headers. Date: the time when the message was created

    • Request fields: appear only in request headers. Host: the requested host (required). User-Agent: the client that initiated the request, such as Chrome or Safari

    • Response fields: appear only in response headers. Server: the name and version of the web server software (usually not reported precisely, because doing so would expose known vulnerabilities)

    • Entity fields: additional information about the body. Content-Length: the length of the body
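
The three-part structure can be seen by splitting a raw message by hand. The sketch below is a simplified illustration, not a complete parser: `parse_http_message` is a hypothetical helper, and it ignores details such as repeated header fields and chunked bodies.

```python
from typing import Dict, Tuple


def parse_http_message(raw: bytes) -> Tuple[str, Dict[str, str], bytes]:
    """Split a raw HTTP/1.x message into start line, header fields, and body."""
    # The blank line (CRLF CRLF) separates the header from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Field names are case-insensitive, so normalise them to lower case.
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


request = b"GET / HTTP/1.1\r\nHost: example.com\r\nUser-Agent: demo\r\n\r\n"
response = b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello"

print(parse_http_message(request))
print(parse_http_message(response))
```

The request here has a header but no body, while the response has both; in each case the blank line marks where the header ends.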

6. Common standard request methods

  • GET: Obtains resources. Read or download data from the server
  • HEAD: obtains the meta information of a resource. The response carries the same header as GET but no message body, which keeps the response small
  • POST: submits data to the server. Equivalent to writing or uploading data (the RFC suggests using it for create operations)
  • PUT: similar to POST (the RFC recommends using it for update operations)
  • DELETE: deletes resources
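
The difference between GET and HEAD is easy to observe. The following sketch assumes a throwaway local server; `DemoHandler`, `BODY`, and `compare_get_and_head` are hypothetical names, and only the standard library's `http.server` and `http.client` modules are used.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

BODY = b"hello, http"


class DemoHandler(BaseHTTPRequestHandler):
    def _send_head(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()

    def do_GET(self):
        self._send_head()
        self.wfile.write(BODY)

    def do_HEAD(self):
        self._send_head()  # same header fields as GET, but no body

    def log_message(self, *args):
        pass  # keep the demo quiet


def compare_get_and_head() -> dict:
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    results = {}
    try:
        for method in ("GET", "HEAD"):
            conn = HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
            conn.request(method, "/")
            resp = conn.getresponse()
            results[method] = (resp.status, resp.getheader("Content-Length"), resp.read())
            conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return results
```

Both methods report the same status and Content-Length, but only GET actually delivers the body bytes.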

7. URI composition

URI stands for Uniform Resource Identifier. URIs include two kinds of identifier, URL and URN. A URI consists of scheme, host:port, path, and query.

  • Scheme: the protocol used to access the resource, such as http, https, or ftp. (It must be followed by ://, which separates the scheme from the host.)
  • Host:port: the host name, optionally followed by a port number. The host name can also be an IP address. It must be specified; otherwise the server cannot be located. If the port is omitted, the default port for the scheme is used, for example 80 for HTTP and 443 for HTTPS
  • Path: the location of the resource on the host. With scheme + host + path, the browser can access the resource directly
  • Query: the query parameters. They give the server extra information so that it can return more precise data

URI encoding: special characters inside a URI can cause parsing to fail, so URI encoding (percent-encoding) is used to escape them
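
Python's `urllib.parse` module can split a URI into these parts and apply URI encoding. In this sketch, `split_uri` and `DEFAULT_PORTS` are hypothetical names, and the example URI is made up for illustration.

```python
from urllib.parse import parse_qs, quote, unquote, urlsplit

# Default ports mentioned above; other schemes are left as None here.
DEFAULT_PORTS = {"http": 80, "https": 443}


def split_uri(uri: str) -> dict:
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        # Fall back to the scheme's default port when none is given.
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "path": parts.path,
        "query": parse_qs(parts.query),
    }


print(split_uri("https://example.com/search?q=http&page=2"))

# URI encoding: characters with a special meaning are replaced by %XX escapes.
encoded = quote("a b&c")
print(encoded, unquote(encoded))
```

Encoding the value before placing it in the query keeps a literal `&` or space from being mistaken for URI syntax.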

8. Common response status codes

  • 1xx: informational. Processing is in an intermediate state and further action is required
  • 2xx: success. The message was received and processed correctly
    • 200 OK: everything is normal
    • 204 No Content: like 200, except that the response has no body
    • 206 Partial Content: the body contains only part of the resource; this is the basis of HTTP resumable (range) downloads
  • 3xx: redirection. The resource location has changed, or a cached copy should be used
    • 301 Moved Permanently: permanent redirection; the resource is no longer at this URI, and the new URI should be used from now on
    • 302 Found: temporary redirection; the resource still exists but must be accessed through another URI for the time being
    • 304 Not Modified: cache redirection; the resource has not changed, so the client can keep using its cached copy
  • 4xx: client error. The request message is wrong and the server cannot process it
    • 400 Bad Request: the request message is incorrect; the specific error is not stated
    • 403 Forbidden: the server forbids access to the resource
    • 405 Method Not Allowed: the request method is not allowed for this resource, for example sending GET to a resource that only accepts POST
    • 408 Request Timeout: the request timed out
  • 5xx: server error. An error occurred inside the server while processing the request
    • 500 Internal Server Error: a generic server error code
    • 502 Bad Gateway: the server itself, acting as a gateway or proxy, is working properly, but an error occurred when it accessed the back-end server
    • 503 Service Unavailable: the server is busy and cannot respond for the time being
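
Since the first digit determines the class, a status code can be classified with integer division. This sketch uses the standard `http.HTTPStatus` enum for the reason phrases; `STATUS_CLASSES` and `describe` are hypothetical names.

```python
from http import HTTPStatus

STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code: int) -> str:
    # The hundreds digit selects the class of the status code.
    category = STATUS_CLASSES.get(code // 100, "unknown")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        phrase = "(non-standard)"  # not a code known to HTTPStatus
    return f"{code} {phrase}: {category}"


for code in (200, 206, 301, 304, 405, 502):
    print(describe(code))
```

A client that meets an unfamiliar code can still react sensibly by treating it like the generic code of its class.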

9. Features of HTTP

  • Flexible and extensible

HTTP was very simple when it was first designed. Over successive versions the body has grown from plain text only to images, audio, and more, which shows how extensible it is

  • Reliable transport

Because HTTP is built on TCP/IP, it inherits TCP's reliability. Reliable here means that during transmission the protocol does its best to deliver the data completely

  • Application layer protocol

HTTP is an application-layer protocol, and an unusually feature-rich one

  • Request-reply mode

HTTP uses request-response communication: the client must first establish a connection and send a request, and only then does the server send data back

  • Stateless

Stateless means that neither the client nor the server records anything about the current exchange, so the next request has to supply all of its context again. But since HTTP is extensible, there are extensions that make exchanges stateful (cookies, etc.)
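
As an illustration of how cookies add state on top of a stateless protocol, the sketch below builds the header field a server might send and reads back the value a client would return on its next request. It uses the standard `http.cookies` module; `build_set_cookie`, `read_session_id`, and the cookie name `session_id` are assumptions made for this example.

```python
from http.cookies import SimpleCookie


def build_set_cookie(session_id: str) -> str:
    """Server side: header line that asks the client to remember a session id."""
    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    cookie["session_id"]["path"] = "/"
    return cookie.output(header="Set-Cookie:")


def read_session_id(cookie_header: str) -> str:
    """Server side: recover the session id from the client's Cookie header value."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    return cookie["session_id"].value


print(build_set_cookie("abc123"))
print(read_session_id("session_id=abc123; theme=dark"))
```

The protocol itself still keeps no state: the client simply sends the stored value back with every request, and the server uses it to look up what it remembered.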
