Understand the HTTP protocol

1 Process of accessing web pages

2 HTTP concept

HTTP is a protocol for retrieving web resources such as HTML. It is the basis for data exchange on the Web.

  • It is a client-server protocol
  • It transfers data on top of the TCP/IP protocol suite
  • It is an application-layer protocol

3 HTTP history

  • HTTP/0.9 – Single-line protocol

HTTP at this point was very simple: only the GET method was supported, there were no headers, and only hypertext (plain HTML text) could be retrieved.

  • HTTP/1.0 – Building the protocol framework

In 1996, HTTP was officially published as a standard, HTTP/1.0. Version 1.0 added headers, status codes, authorization, caching, long connections (short connections remained the default), and other specifications; it can be said to have built the basic framework of the protocol.

  • HTTP/1.1 – Further improvements

Version 1.1 followed in 1997. Its major improvements were long connections by default, a mandatory Host header from the client, pipelining, and cache extensions such as Cache-Control and ETag.

4 The TCP/IP protocol

4.1 Concept of TCP/IP

TCP/IP is the general name for the group of protocols used when communicating over IP. Specifically, IP, ICMP, TCP, UDP, TELNET, FTP, and HTTP are all TCP/IP protocols.

4.2 The TCP/IP Model

4.3 Data Transmission

At each layer, a header containing the information that layer needs, such as the destination address and protocol-related information, is prepended to the data to be sent. Typically, the information supplied for the protocol is the packet header, and the content to be sent is the data. From the point of view of the next layer down, everything received from the layer above is treated as this layer's data. As packets travel down from the application layer they are encapsulated; as they travel up from the link layer they are decapsulated.

4.3.1 Application processing

First, the application performs encoding conversion (equivalent to the OSI presentation layer). After the conversion, the mail is not sent immediately; it waits for a communication connection to be established (equivalent to the OSI session layer).

4.3.2 TCP Module Processing (Sender)

TCP is responsible for establishing connections, sending data, and closing connections according to the instructions of the application. TCP provides reliable transmission of data sent from the application layer to the peer. To achieve this, it attaches a TCP header to the front of the application-layer data.

4.3.3 IP Module Processing (Sender)

The IP module treats the TCP header and TCP data together as its own data and prepends its own IP header. After the IP packet is generated, appropriate internetwork routing and switching nodes are selected to transmit it.

4.3.4 Network Interface Processing on the Sender (Data Link Layer + Physical Layer)

The network interface receives the IP packet from the IP module and prepends an Ethernet header, generating an Ethernet frame that is transmitted to the receiver through the physical layer.

4.3.5 Network Interface Processing on the Receiver (Data Link Layer + Physical Layer)

After receiving an Ethernet frame, the host first checks the destination MAC address in the frame header to see whether the frame is addressed to it; if not, it discards the frame. If the frame is addressed to it, the host determines the payload type from the type field in the Ethernet header and passes the data to the corresponding module, such as IP or ARP.

4.3.6 IP Module Processing (Receiver)

After receiving the data, the IP module checks whether the destination IP address in the packet header matches its own. If so, it removes the protocol header (decapsulation) and passes the data to the module indicated by the protocol type in the header, such as TCP or UDP. In the case of a router, the destination address is usually not its own; the router then consults its routing table to determine the next host or router and forwards the data.

4.3.7 TCP Module Processing (Receiver)

The TCP module first computes the checksum to determine whether the data is damaged, then checks by sequence number whether the data arrived in order and reassembles it accordingly, and finally checks the port number to determine the target application. Once the data has been received in its entirety, it is passed to the application identified by the port number.

4.3.8 Processing of applications

The receiving application receives the data sent by the sender directly and displays the corresponding content by parsing it.

4.4 TCP Header

  • Source port and destination port, 2 bytes each, holding the source and destination port numbers.
  • Sequence number, 4 bytes. Each byte in the byte stream transmitted over a TCP connection is numbered sequentially. For example, if the sequence number field of a segment is 301 and it carries 100 bytes of data, the sequence number of the next segment (if any) should start at 401.
  • Acknowledgment number, 4 bytes: the number of the first byte of data expected in the next segment from the peer. For example, if B receives a segment from A with sequence number 501 and data length 200 bytes, B has correctly received A's data up through number 700. B therefore expects A's next data to be numbered 701, so B sets the acknowledgment number to 701 in the acknowledgment segment it sends to A.
  • Data offset, 4 bits: the distance (in 32-bit words) from the start of the TCP segment to the start of the data, i.e., the header length.
  • Reserved, 6 bits, reserved for future use; currently should be 0.
  • Urgent URG: when URG=1, the urgent-pointer field is valid, telling the system that this segment contains urgent data.
  • Acknowledgment ACK: the acknowledgment-number field is valid only when ACK=1. Per TCP, all segments transmitted after the connection is established must set ACK=1.
  • Push PSH: when two application processes communicate interactively, the process at one end sometimes wants a response immediately after entering a command; in this case PSH is set to 1.
  • Reset RST: if RST=1, a serious error has occurred in the TCP connection, and the connection must be released and re-established.
  • Synchronize SYN: used to synchronize sequence numbers when a connection is established. SYN=1, ACK=0 indicates a connection request; a response accepting the connection has SYN=1, ACK=1.
  • Finish FIN: used to release a connection. FIN=1 means the sender of the segment has finished sending data and wants to release the connection.
  • Window, 2 bytes: tells the peer how many bytes of data the receiver is currently willing to accept (flow control).
  • Checksum, 2 bytes: covers both the header and the data.
  • Urgent pointer, 2 bytes: the number of bytes of urgent data in this segment.
  • Options, variable length: define other optional parameters.
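
The header layout above can be made concrete by unpacking the fixed 20-byte header with Python's `struct` module. This is an illustrative sketch (options are ignored, and the sample segment is hand-built), not a full TCP implementation:

```python
import struct

# Parse the fixed 20-byte portion of a TCP header, following the field
# list above: ports, sequence/acknowledgment numbers, data offset, flags,
# window, checksum, urgent pointer.
def parse_tcp_header(raw: bytes) -> dict:
    (src_port, dst_port, seq, ack,
     offset_flags, window, checksum, urgent) = struct.unpack("!HHIIHHHH", raw[:20])
    data_offset = (offset_flags >> 12) & 0xF   # header length in 32-bit words
    flags = offset_flags & 0x3F                # URG/ACK/PSH/RST/SYN/FIN bits
    return {
        "src_port": src_port, "dst_port": dst_port,
        "seq": seq, "ack": ack, "data_offset": data_offset,
        "URG": bool(flags & 0x20), "ACK": bool(flags & 0x10),
        "PSH": bool(flags & 0x08), "RST": bool(flags & 0x04),
        "SYN": bool(flags & 0x02), "FIN": bool(flags & 0x01),
        "window": window, "checksum": checksum, "urgent_ptr": urgent,
    }

# Hand-build a sample SYN segment header: port 12345 -> 80, seq=301,
# data offset 5 (no options), SYN flag set.
hdr = struct.pack("!HHIIHHHH", 12345, 80, 301, 0, (5 << 12) | 0x02, 65535, 0, 0)
fields = parse_tcp_header(hdr)
```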

4.5 Three-way handshake

4.5.1 Three-way Handshake Process

  1. The TCP server process first creates the transmission control block (TCB) and prepares to accept connection requests from clients; the server is now in the LISTEN state.
  2. The TCP client process creates its TCB and sends a connection-request segment to the server, with SYN=1, ACK=0 and a chosen initial sequence number seq=x; the client enters the SYN-SENT state. Per TCP, a SYN segment (SYN=1) cannot carry data but consumes one sequence number.
  3. On receiving the request segment, if the server agrees to the connection, it sends an acknowledgment segment with SYN=1, ACK=1, ack=x+1 and its own initial sequence number seq=y; the server enters the SYN-RCVD state. This segment also carries no data but again consumes one sequence number.
  4. On receiving the server's acknowledgment, the client sends a final acknowledgment with ACK=1, ack=y+1, seq=x+1; the client enters the ESTABLISHED state. Per TCP, an ACK segment may carry data; if it does not, it consumes no sequence number.
  5. After receiving the client's acknowledgment, the server also enters the ESTABLISHED state, and the two sides can communicate.
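
The five steps above are performed by the operating system kernel; in the socket API, listen() corresponds to entering LISTEN, connect() performs the SYN / SYN+ACK / ACK exchange, and accept() returns once the connection is ESTABLISHED. A minimal loopback sketch:

```python
import socket

# The kernel performs the three-way handshake; this sketch only shows
# where each step happens in the socket API. Loopback only.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))          # pick an ephemeral port
server.listen(1)                       # server enters LISTEN
port = server.getsockname()[1]

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("127.0.0.1", port))    # SYN -> SYN+ACK -> ACK all happen here
conn, _ = server.accept()              # both sides are now ESTABLISHED

client.sendall(b"hello")
data = conn.recv(1024)

conn.close(); client.close(); server.close()
```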
4.5.2 Why a three-way handshake
  • Confirm the sending and receiving capability of both parties

Before establishing a TCP connection, ensure that the client and server can receive and send packets.

  1. First handshake: The client sends a network packet and the server receives it. In this way, the server can conclude that the sending capability of the client and the receiving capability of the server are normal.

  2. Second handshake: The server sends the packet and the client receives it. In this way, the client can conclude that the receiving and sending capabilities of the server and the client are normal. However, the server cannot confirm whether the client’s reception capability is normal.

  3. Third handshake: The client sends the packet and the server receives it. In this way, the server can conclude that the receiving and sending capabilities of the client are normal, and the sending and receiving capabilities of the server are also normal.

Therefore, at least three handshakes are needed to confirm that both parties can send and receive normally.

  • Reliable serial number synchronization

In the case of two handshakes, the server cannot determine whether the client has received the initial sequence number it sent. If the second handshake packet is lost, the client cannot know the initial sequence number of the server, and TCP reliability is out of the question.

  • Prevents initialization of repeated history connections

Suppose the client sends two SYN packets with different sequence numbers for some reason. The network environment is complex, and the older packet may reach the server first. With a two-way handshake, the server would establish a connection immediately on receiving the stale SYN, causing a network anomaly. With a three-way handshake, the server replies with a SYN+ACK packet; the client compares the acknowledged sequence number and sends an RST to the server if the reply corresponds to the stale request. The connection is established only after a normal SYN packet arrives. The three-way handshake thus provides enough context to determine whether the current connection is a stale historical one.

  • Security issues

When TCP creates a new connection, the kernel allocates a set of memory resources for it. If connections could be established with only two handshakes, DDoS attacks would be amplified. TCP is a reliable transport control protocol whose core goals are to ensure reliable data transmission while keeping transfer efficient, and the three-way handshake satisfies both requirements.

4.6 Four waves (connection release)

4.6.1 Four-wave process

  1. The client process sends a connection-release segment and stops sending data: FIN=1, with sequence number seq=u (equal to the sequence number of the last byte of previously transmitted data plus 1). The client enters the FIN-WAIT-1 state. Per TCP, a FIN segment consumes one sequence number even if it carries no data.
  2. On receiving the connection-release segment, the server sends an acknowledgment with ACK=1, ack=u+1 and its own sequence number seq=v; the server enters the CLOSE-WAIT state. The TCP server notifies the higher-level application process that the client-to-server direction has been released. The connection is now half-closed: the client has no more data to send, but if the server sends data, the client will still accept it. This state lasts for the duration of CLOSE-WAIT.
  3. After receiving the server's acknowledgment, the client enters the FIN-WAIT-2 state and waits for the server's connection-release segment (possibly still receiving final data from the server).
  4. After sending its last data, the server sends a connection-release segment with FIN=1 and ack=u+1 to the client; the server may still be in the half-closed state. Assuming its sequence number is seq=w, the server enters the LAST-ACK state and waits for the client's acknowledgment.
  5. On receiving the server's release segment, the client replies with ACK=1, ack=w+1 and sequence number seq=u+1, and enters the TIME-WAIT state. Note that the TCP connection is not released yet: only after 2×MSL (maximum segment lifetime) does the client enter the CLOSED state and revoke its TCB.
  6. The server enters the CLOSED state as soon as it receives the client's acknowledgment, likewise revoking its TCB and ending the TCP connection. The server thus ends the TCP connection earlier than the client.
4.6.2 Why Does Closing the Connection Take Four Waves

In the TCP handshake, the receiver can combine its ACK and SYN into a single segment, so one transmission is saved and the handshake completes in three steps.

Because TCP is full-duplex, after the actively closing side sends its FIN, the other side may still have data to send; the server-to-client data channel cannot be closed immediately. The server's FIN therefore cannot be combined with its ACK of the client's FIN. The server sends its FIN only once it has no more data to send, so connection release requires four segment exchanges.

4.6.3 Why TIME_WAIT waits 2MSL before returning to CLOSED
  • MSL is the maximum lifetime of a segment on the network. After the client ACKs the server's FIN, that ACK may fail to arrive; if the server does not receive it, the server retransmits its FIN. The client therefore waits 2MSL after sending the ACK (one MSL for the ACK to reach the server, one for a retransmitted FIN to come back). If no retransmitted FIN arrives within 2MSL, the client can be confident the server received its ACK.
  • Avoid confusion between old and new connections. After the client sends its last ACK, waiting 2MSL ensures that all segments generated during this connection have disappeared from the network, so stale request segments cannot appear in a subsequent new connection.
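
A practical consequence of TIME_WAIT: a restarted server may fail to bind() its port while old sockets linger there, which is why servers routinely set SO_REUSEADDR. A small sketch (the socket option is real; the scenario is illustrative):

```python
import socket

# A socket that actively closes a connection lingers in TIME_WAIT for 2MSL,
# which can make an immediate re-bind of the same port fail. Servers commonly
# set SO_REUSEADDR so a restarted process can re-bind while old sockets are
# still in TIME_WAIT.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
reuse = s.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR)
s.close()
```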

5 DNS

5.1 Concept

The Domain Name System (DNS) is a distributed database mapping domain names to IP addresses. It lets users access the Internet conveniently without having to remember the machine-readable IP address strings. The process of obtaining the IP address corresponding to a host name is called domain name resolution.

5.2 Parsing Process

1. When a request is initiated, the browser first resolves the domain name. It usually begins by checking the hosts file on the local disk for a rule matching the domain.

2. If no matching IP address is found in the local hosts file, the browser sends a DNS request to the local DNS server, which is usually provided by your ISP, such as China Telecom or China Mobile.

3. When the DNS query for the URL you entered reaches the local DNS server, it first checks its cache. If the record exists in the cache, the local DNS server returns the result directly; otherwise it queries a DNS root server.

4. The root DNS server does not record mappings between domain names and IP addresses. Instead, it tells the local DNS server which top-level domain server to query.

5. The local DNS server then queries the top-level domain server, in this case the .com domain server. After receiving the request, the .com server does not return the final mapping either; it tells the local DNS server the address of the authoritative name server for your domain.

6. Finally, the local DNS server queries the authoritative name server and receives the mapping between the domain name and the IP address. It returns the IP address to the user's computer and also stores the mapping in its cache, so that a later query for the same name can be answered directly, speeding up network access.
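
In practice, steps 1–6 are hidden behind a single resolver call; in Python, `socket.gethostbyname()` asks the OS stub resolver, which consults the hosts file first and then the local DNS server. A minimal sketch (the commented-out `example.com` lookup is only a placeholder and would need network access):

```python
import socket

# Step 1 of the resolution process above (the hosts file) is what answers
# this call for "localhost"; for a real domain, the same call triggers the
# full recursive resolution via the local DNS server, root, TLD, and
# authoritative servers.
ip = socket.gethostbyname("localhost")   # usually answered from the hosts file
print(ip)

# ip = socket.gethostbyname("example.com")   # full recursive resolution
```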

5.3 Parsing Methods

5.3.1 Recursive Query

Concept: the client sends a single request and requires the queried server to provide the final result. DNS clients generally perform recursive queries to the local name server.

5.3.2 Iterative Query

Concept: a client makes a request, and if the queried server is not authoritative for the answer, it returns a list of other name servers that can answer the query. The client makes further requests from the returned list until it finds the name server ultimately responsible for the queried domain and gets the final result from it. The local DNS server performs iterative queries to the other DNS servers.

2. HTTP structure and communication principles

1 Main features of HTTP

1. Simple and fast
  • When a client requests a service from a server, it only needs to send the request method and path.
  • Each method specifies a different type of interaction between the client and the server.
  • Because the HTTP protocol is simple, HTTP server programs are small, so communication is very fast.
2. Flexible
  • HTTP allows the transfer of data objects of any type.
  • The type being transferred is marked by Content-Type.
3. Connectionless
  • Connectionless means processing is limited to one request per connection.
  • The server disconnects from the client after processing the request and sending the response.
  • This saves transmission time.
4. Stateless
  • HTTP is a stateless protocol.
  • Stateless means the protocol has no memory of transactions. If previous information is needed for later processing, it must be retransmitted, which can increase the amount of data transferred per connection.
  • On the other hand, the server responds faster when it does not need previous information.
5. Supports B/S and C/S modes

2 HTTP message

2.1 Packet Structure

2.1.1 Request Packets

An HTTP request packet consists of four parts: request line, request headers, blank line, and request body

2.1.2 Response Packets

An HTTP response packet consists of four parts: response line, response header, blank line, and response body
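
The four-part structure of both message types can be seen by building a raw request by hand and splitting it at the blank line. The host, path, and body below are made up for illustration:

```python
# A raw HTTP request, written out with the four parts described above:
# request line, headers, blank line, body.
request = (
    "POST /login HTTP/1.1\r\n"                             # request line
    "Host: example.com\r\n"                                # headers
    "Content-Type: application/x-www-form-urlencoded\r\n"
    "Content-Length: 8\r\n"
    "\r\n"                                                 # blank line
    "user=tom"                                             # request body
)

# Splitting at the blank line separates head from body; the first head line
# is the request line, the rest are headers.
head, _, body = request.partition("\r\n\r\n")
request_line, *header_lines = head.split("\r\n")
headers = dict(h.split(": ", 1) for h in header_lines)
```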

2.2 Packet Header Classification

2.2.1 Common Headers

A header used by both the request and response packets.

2.2.2 Request Header

The packet header used to send request packets from the client to the server contains additional information about the request, client information, and priority of the response content.

2.2.3 Response Header

Headers used in a response packet sent from the server to the client; they supplement the response with additional information and may request further information from the client.

2.2.4 Entity Header

For the headers used in the entity parts of request and response messages, entity-related information such as update time of resource content is added.

3 URI

  • Uniform Resource Identifier (URI) : uniquely identifies an Internet Resource.
  • URL (Uniform Resource Locator)
  • URN (Uniform Resource Name)

URIs are abstract identifiers that locate a resource regardless of how they are expressed. They include URLs (locating by address) and URNs (locating by name). For example, suppose you go to a village to find a specific person (the URI): finding them by their house address is like a URL, while finding them by ID number plus name is like a URN.

4 HTTP request method

4.1 Idempotence

For the same system under the same conditions, making a request once and repeating the same request many times have the same effect on system resources.
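
A toy in-memory resource store can make the definition concrete: repeating PUT or DELETE leaves the store in the same state as issuing it once, while repeating POST keeps creating resources. All names here are illustrative:

```python
# Toy "server" state demonstrating idempotence of the methods below.
store = {}
next_id = [1]

def put(rid, doc):
    # Idempotent: the final state is the same however often it is repeated.
    store[rid] = doc

def delete(rid):
    # Idempotent: deleting an already-deleted resource changes nothing.
    store.pop(rid, None)

def post(doc):
    # Non-idempotent: each call creates a new resource with a new ID.
    rid = next_id[0]; next_id[0] += 1
    store[rid] = doc
    return rid

put("a", {"v": 1}); put("a", {"v": 1})   # repeated PUT -> still one resource
post({"v": 2}); post({"v": 2})           # repeated POST -> two resources
```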

4.2 Request Method

1. GET

A GET request retrieves the resource identified by the request and should only be used to read data. Idempotent.

2. POST

A POST request submits data to the specified resource for the server to process; the data is included in the request body. Non-idempotent: the request may create a new resource or modify an existing one.

3. PUT

A PUT request uploads the complete latest content to the specified resource location. Idempotent. Rarely used.

4. DELETE

A DELETE request asks the server to delete the resource identified by the request URI. Idempotent. Rarely used.

5. PATCH

PATCH requests are similar to PUT requests and are used to update resources. PATCH is generally used for partial resource updates, while PUT is generally used for overall resource updates.

6. HEAD

The HEAD method, like GET, requests a specified resource from the server, but the server does not return a response body when answering a HEAD request. This lets a client obtain the response headers without transferring the entire content, for example to check whether a resource exists or to inspect its metadata.

7. OPTIONS

OPTIONS requests are used to query the server's capabilities. The method asks the server to return all HTTP request methods supported for a resource. Replacing the resource name with '*' and sending an OPTIONS request tests whether the server itself is functioning properly. When JavaScript's XMLHttpRequest performs CORS cross-origin requests, the browser uses the OPTIONS method to send a preflight request to determine whether it has access to the specified resource.

8. CONNECT

The CONNECT method, reserved in HTTP/1.1, turns the connection into a tunnel through a proxy server. It is typically used to tunnel SSL/TLS-encrypted traffic through an unencrypted HTTP proxy. Rarely used directly.

9. TRACE

A TRACE request asks the server to echo back the request it received. It is mainly used to test or diagnose HTTP requests. Rarely used.

4.3 Differences between GET and POST

4.3.1 Online answers
  • GET is harmless when the browser falls back, while POST resubmits the request
  • The URL generated by GET can be bookmarked, but POST cannot
  • GET requests are actively cached by browsers, whereas POST is not, unless set manually
  • GET requests can only be URL-encoded, while POST supports multiple encoding methods
  • GET request parameters are retained in browser history, while parameters in POST are not
  • GET requests pass parameters in the URL with length limits, whereas POST does not
  • GET accepts only ASCII characters for the data type of the argument, while POST has no restrictions
  • GET is less secure than POST because parameters are exposed directly to the URL and therefore cannot be used to pass sensitive information
  • The GET argument is passed through the URL, and the POST is placed in the Request body
4.3.2 My own answer
  • Different uses: GET requests resources, and POST transfers resources to the server
  • The idempotent is different: GET is idempotent, while POST is non-idempotent
  • Parameters are passed differently: GET parameters are placed in the URL and have a length limit; POST parameters are placed in the body and have no length limit
  • Security is different: The GET parameter is directly exposed to the URL, while POST is relatively secure
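
The parameter-passing difference can be sketched with the standard library: GET parameters travel in the URL's query string, while POST parameters (in the common form encoding) travel in the request body. The URL below is a placeholder:

```python
from urllib.parse import urlencode, urlparse, parse_qs

# The same parameters, carried two different ways.
params = {"q": "http protocol", "page": "2"}

get_url = "https://example.com/search?" + urlencode(params)  # visible in URL
post_body = urlencode(params)                                # sent in the body

# The server recovers GET parameters by parsing the query string.
parsed = parse_qs(urlparse(get_url).query)
```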

5 HTTP status code

5.1 Status: 1xx

Indicates that the request has been received and is being processed

  • 100 (Continue) The requester should continue to make the request. The server returns this code to indicate that it has received the first part of the request and is waiting for the rest
  • 101 (Switching protocol) The requester has asked the server to switch protocol, and the server has confirmed and is ready to switch

5.2 Status: 2xx

Success: the request is received successfully

  • 200 (Success) The server has successfully processed the request
  • 204 (No content) The server successfully processed the request, but did not return anything (such as an Options request)
  • 206 (Partial content) The server successfully processed a partial GET request (e.g., one with a Range request header, as used for resumable downloads)

5.3 Status: 3xx

Redirect. Further operations must be performed to complete the request

  • 301 Permanent redirection: used when the requested URL has been moved; the Location header of the response should contain the resource's current URL
  • 302 Temporary redirection: similar to permanent redirection, but the client uses the URL in Location only to locate the resource this time; future requests should still use the original URL
  • 304 (Unmodified) The requested page has not been modified since the last request. When the server returns this response, the web page content is not returned

5.4 Status: 4xx

Client error

  • 400 (Error request) The request has a syntax error
  • 401 (Unauthorized) The request requires authentication
  • 403 (Forbidden) The server rejects the request
  • 404 (Not found) The server could not find the requested page

5.5 Status: 5xx

Server error

  • 500 (Server internal error) The server encountered an error and could not complete the request
  • 502 (Bad gateway) The server, acting as a gateway or proxy, received an invalid response from the upstream server

  • 503 (Service unavailable) The server is currently unavailable (due to overloading or downtime for maintenance). Usually, this is a temporary state
  • 504 (Gateway timeout) The server, acting as a gateway or proxy, did not receive a response from the upstream server in time
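
Python's standard library ships this status registry as `http.HTTPStatus`, which is convenient for looking up a code's reason phrase; the helper that maps a code to its class is our own illustrative addition:

```python
from http import HTTPStatus

# HTTPStatus entries carry both the numeric value and the reason phrase.
code = HTTPStatus.NOT_FOUND
print(code.value, code.phrase)

# Illustrative helper: map a code to the class described in the sections above.
def status_class(value: int) -> str:
    return {1: "informational", 2: "success", 3: "redirection",
            4: "client error", 5: "server error"}[value // 100]
```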

6 HTTP status Management

6.1 Cookie + Session Status Management

  • Session is a data structure stored on the server to track user status. This data can be stored in clusters, databases, and files
  • A Cookie is a mechanism for the client to store user information; it records some user information and is also one way to implement Sessions

HTTP is a stateless protocol, so when the server needs to track user state it must use some mechanism to identify a specific user: the Session. Take a typical shopping-cart scenario: when you click the order button, HTTP being stateless, the server does not know which user is acting. It therefore creates a Session for that particular user, identifying and tracking them, so it knows how many books are in the cart. The Session is stored on the server and has a unique identifier; it can be kept in many ways, including memory, databases, and files.

How does the server recognize a particular client? This is where Cookies come in. With each HTTP request, the client sends its Cookies to the server. Indeed, most applications use Cookies to implement Session tracking: when a Session is first created, the server tells the client, via the HTTP response, to record a Session ID in a Cookie; the client sends this Session ID with every subsequent request, and the server then knows who you are. What if Cookies are disabled in the client's browser? In that case Session tracking is typically done with a technique called URL rewriting: every HTTP interaction appends a parameter such as sid=xxxxx to the URL, from which the server identifies the user.
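
A minimal sketch of this mechanism, assuming an illustrative SESSIONID cookie name and an in-memory session table (real frameworks additionally handle expiry, signing, and storage):

```python
import secrets

# Server-side session table: session_id -> per-user state. The client only
# ever sees the opaque session ID, carried in a cookie.
sessions = {}

def login(username):
    # On first contact, create a session and tell the client to store its ID.
    sid = secrets.token_hex(16)
    sessions[sid] = {"user": username, "cart": []}
    return f"Set-Cookie: SESSIONID={sid}; HttpOnly"

def handle_request(cookie_header):
    # On later requests, recover the user's state from the cookie's ID.
    sid = cookie_header.split("SESSIONID=")[1].split(";")[0]
    return sessions[sid]

set_cookie = login("tom")
sid = set_cookie.split("SESSIONID=")[1].split(";")[0]
state = handle_request(f"Cookie: SESSIONID={sid}")
state["cart"].append("book")
```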

6.2 Token Status Management

6.2.1 Concepts of JWT (JSON Web Tokens)
  • JSON Web Token is an open standard
  • It defines a compact, self-contained way to transfer information securely as a JSON object
  • The information can be verified and trusted because it is digitally signed

A JWT is often called a JSON token. Because the information stored in a JWT is digitally signed, it can be trusted and verified.

6.2.2 JWT composition

It consists of a Header, a Payload, and a Signature

6.2.3 Verification Process

Token authentication is stateless: the server does not record which users have logged in or which JWTs it has issued. Instead, every request carries a token that the server verifies. The token is usually placed in the Authorization header in the form Bearer {JWT}, but it can also be sent in a POST body or even as a query parameter.

Verification process:

  1. The user enters their login information
  2. The server verifies that the login information is correct and returns a token
  3. The token is stored on the client, most often in local storage, but it can also be kept in session storage or a cookie
  4. The client then puts the token into the Authorization header when making requests (or sends it in one of the other ways mentioned above)
  5. The server decodes the JWT and validates the token; if the token is valid, it processes the request
  6. When the user logs out, the token is destroyed on the client side; the server does not need to be involved
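
The Header.Payload.Signature structure and the validation step can be sketched with the standard library's hmac module. This is a minimal HS256 sketch under stated assumptions (no exp/aud claim checking); production code should use a maintained JWT library such as PyJWT:

```python
import base64, hashlib, hmac, json

def b64url(data: bytes) -> str:
    # JWTs use unpadded URL-safe base64.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def sign(payload: dict, secret: bytes) -> str:
    # Header.Payload.Signature, as described above.
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = b64url(json.dumps(payload).encode())
    sig = hmac.new(secret, f"{header}.{body}".encode(), hashlib.sha256).digest()
    return f"{header}.{body}.{b64url(sig)}"

def verify(token: str, secret: bytes) -> bool:
    # Recompute the signature over header.payload and compare.
    header, body, sig = token.split(".")
    expected = hmac.new(secret, f"{header}.{body}".encode(), hashlib.sha256).digest()
    return hmac.compare_digest(b64url(expected), sig)

token = sign({"sub": "user42", "role": "admin"}, b"server-secret")
```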

6.3 Advantages of Token Mode

1. Stateless

The biggest advantage of tokens over cookies is statelessness. The back end does not need to keep a record of tokens: each token is self-contained, holding all the data needed to check its validity and conveying user information through claims. The server's only work is signing and validating tokens.

2. Cross-domain and CORS

Cookies work well with a single domain and its subdomains, but become difficult to handle across origins. Token-based CORS handles cross-origin requests well: since the JWT is checked on every request to the back end, the request can be processed as soon as the token validates.

3. Store data in JWT

When using cookies for authentication, you store only a session ID in the cookie, whereas a JWT lets you store any metadata that is valid JSON: just a user ID and expiration date, or additional fields such as an email address or domain name. For example, suppose you have an API at /api/orders that returns the latest orders, but only admin users may access it. With cookie-based authentication, once the request arrives, the server must check the session store (often Redis rather than the database) to verify the session, then look up the user's permissions to verify admin rights (e.g., by role_id), and only then fetch the order data. With a JWT, the user's role can be stored inside the token, so once the signature validates, the order data can be fetched directly.

4. Mobile platforms

Modern APIs do not interact only with browsers: a properly written API can serve both browsers and native mobile platforms such as iOS and Android. Native mobile platforms do not always work well with cookies, and come with limitations and caveats. Tokens, on the other hand, are easier to implement on iOS and Android, and also suit IoT applications and services that have no concept of cookie storage.

Third, in-depth understanding of HTTP features and usage

1. Encoding and decoding

1.1 Coding Specifications

Programmers wanted computers to display characters, but computers recognize only the binary digits 0 and 1. International organizations therefore developed coding specifications that map different binary numbers to different characters, so that a computer can display the character corresponding to a given binary number.

For example, with the GBK coding specification the computer can convert between Chinese characters and binary numbers, which allows it to display Chinese characters.

1.2 Composition of coding specifications

  • Character table

The character table stores all the characters that the coding specification can display. The computer looks characters up in this table by their binary numbers and then displays them to the user; it is effectively a database of characters.

  • Character set

A character set is the collection of binary addresses assigned to each character in the character table. For example, in the ASCII character set the letter A has serial number (address) 65, whose binary representation is 01000001. The coded character set stores these binary numbers: each one is both an element of the set and the address of its character in the character table, and the character can be displayed by looking up that address.

  • Encoding

Displaying text directly from the raw binary addresses of characters is wasteful: to distinguish every character, even a value like 00001111 would be allocated 4 bytes. So programmers devised algorithms to save space, and each such algorithm is called an encoding method.

The whole process: a short binary number is converted by an encoding method into the actual address in the coded character set, that address is used to find the corresponding character in the character table, and the character is finally displayed to the user.

1.3 Common coding specifications

1.3.1 ASCII

ASCII, the earliest coding specification, contains 128 characters ranging from 00000000 to 01111111. It can represent Arabic numerals, upper- and lower-case letters, and some simple symbols. ASCII needs only 1 byte of storage, with the highest bit always 0. It has no separate encoding method; characters are represented directly by the binary number of their address.

1.3.2 GBK

GBK supports the Chinese, Japanese, and Korean characters of international standard ISO/IEC 10646-1 and national standard GB 13000-1. In GBK, Chinese characters occupy 2 bytes, while ASCII characters remain 1 byte. It has no separate encoding method and is widely used in China, where Chinese characters dominate.

1.3.3 Unicode

From the coding specifications above, it can be seen that each is incompatible with the others and can only represent the characters it needs, so the International Organization for Standardization (ISO) decided to develop a universal coding specification: Unicode.

Unicode contains all the characters in the world. A Unicode character can require up to 4 bytes: to distinguish every character, each character's address may need 4 bytes. Using 4 bytes for everything wastes storage space, so programmers designed several character encodings for it, such as UTF-8, UTF-16, and UTF-32.

The most widely used is UTF-8, a variable-length character encoding. Note that UTF-8 is not a coding specification but an encoding method: after UTF-8 encoding, English characters occupy only 1 byte, while Chinese characters occupy 3 bytes.

1.4 Encoding and decoding

Converting a string of binary numbers into characters using an encoding method is called decoding. Using the wrong encoding method produces other, nonsensical characters; this is commonly referred to as mojibake (garbled text).

A decoded string of characters can also be re-converted into binary using any encoding method; this process is encoding. Whichever method is used, the result is binary that the computer can recognize. However, if the character table of the encoding specification does not contain the target character, no corresponding binary number exists in the character set, which results in irreversible garbling. For example, the ISO-8859-1 character table contains no Chinese characters, so even if Chinese text is encoded with ISO-8859-1 and then decoded with ISO-8859-1, the correct characters cannot be recovered.
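A short sketch of these cases in Python (the sample string is an arbitrary choice):

```python
text = "中文"

# UTF-8 is variable-length: 3 bytes per Chinese character, 1 per ASCII letter
utf8 = text.encode("utf-8")
print(len(utf8))                    # 6
print(len("A".encode("utf-8")))     # 1

# Decoding with the wrong codec yields mojibake: wrong characters come out
garbled = utf8.decode("gbk", errors="replace")
print(garbled == text)              # False

# ISO-8859-1 has no Chinese characters in its table at all, so encoding
# fails outright: this is the irreversible case described above.
try:
    text.encode("iso-8859-1")
except UnicodeEncodeError:
    print("iso-8859-1 cannot represent Chinese")
```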

1.5 URL Codec

URLs are encoded using the ASCII character set.

Coding rules:

  • Characters in the URL that are neither reserved nor unsafe are not encoded.
  • Reserved and unsafe characters in the URL are encoded by taking their ASCII code and prefixing it with %.
  • Non-ASCII characters in a URL are encoded by taking each byte of the character's encoding (commonly UTF-8) and prefixing it with %.
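These rules can be seen directly with `urllib.parse` (the sample strings are arbitrary):

```python
from urllib.parse import quote, unquote

# Unreserved characters pass through untouched:
print(quote("AZaz09-_.~"))          # AZaz09-_.~

# Reserved/unsafe characters become %XX (their ASCII code in hex):
print(quote("a b&c", safe=""))      # a%20b%26c

# Non-ASCII characters: each byte of the UTF-8 encoding is percent-encoded
print(quote("中"))                   # %E4%B8%AD

# Decoding reverses the process:
print(unquote("%E4%B8%AD"))         # 中
```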

2. Identity authentication

  • HTTP authentication: developer.mozilla.org/zh-CN/docs/…
  • HTTPS two-way authentication guide: www.jianshu.com/p/2b2d1f511…
  • Basic authentication and Digest authentication: www.cnblogs.com/lsdb/p/1062…

To be added…

3. Long connection and short connection

3.1 concept

  • In HTTP/1.0, short connections are the default: each HTTP operation between browser and server establishes a connection, and the connection is closed when the task ends.
  • Since HTTP/1.1, long connections are the default. An HTTP response using a long connection includes the header Connection: keep-alive. After the client completes an HTTP operation, the established TCP connection stays open and later requests to the same server reuse it. Keep-alive does not hold the connection forever; it has a hold time that can be configured on the server. Long connections require support from both the client and the server.
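The reuse can be observed with Python's standard library: the sketch below starts a local HTTP/1.1 server in a thread and shows that a second request travels over the same TCP socket (the handler and response body are illustrative assumptions):

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"   # HTTP/1.1 => keep-alive by default

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))  # needed for keep-alive
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):   # silence per-request logging
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
conn.request("GET", "/")
first = conn.getresponse()
first_body = first.read()
sock_after_first = conn.sock        # underlying TCP socket after request 1

conn.request("GET", "/")            # second request on the same connection
second = conn.getresponse()
second_body = second.read()
reused = conn.sock is sock_after_first   # socket was not re-created

print(first.status, second.status, reused)
conn.close()
server.shutdown()
```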

3.2 nature

HTTP long and short connections are essentially TCP long and short connections.

3.3 the advantages and disadvantages

  • A long connection saves many TCP setup and teardown operations, reducing overhead and saving time. Long connections suit clients that request resources frequently. However, the server needs management policies, such as closing connections that have had no read or write events for a long time, and, where possible, limiting the maximum number of long connections per client machine, so that one misbehaving client cannot exhaust the back-end service.
  • Short connections are easier for the server to manage: every existing connection is a useful one, and no additional control is required. However, if the client requests frequently, time and bandwidth are wasted on repeated TCP setup and teardown.

The choice between long and short connections depends on the close policies adopted by the client and server, chosen per application scenario. There is no perfect choice, only an appropriate one.

3.4 pipelining

To be added…

4. HTTP proxy

4.1 General Agent

4.1.1 concept

An HTTP client sends its request to the proxy. The proxy server must handle the request and the connection correctly (for example, the Connection: keep-alive header), send the request on to the server, and forward the received response back to the client.

4.1.2 supplement

  1. If I access website A through a proxy, then from A's point of view the proxy is the client; A is completely unaware of the real client, which achieves the goal of hiding the client's IP address. The proxy can also modify the HTTP request headers to tell the server the real client IP through a header such as X-Forwarded-For. However, the server cannot verify whether that header was actually added by the proxy or forged by the client, so be careful when taking the client IP from HTTP header fields.
  2. Explicitly configuring a proxy in the browser is commonly called a forward proxy. With a forward proxy enabled, the browser modifies the HTTP request packets it sends to avoid problems on older proxy servers (for example, with the Connection header).
  3. In the other case, when a client visits website A, it is actually the proxy that is addressed. After receiving the request, the proxy sends a request to the server that actually provides the service and forwards the response to the browser. This is commonly called a reverse proxy, and it can hide the real server's IP address and port. To use a reverse proxy, DNS must resolve the domain name to the proxy server's IP address; the browser then cannot detect the existence of the real server. The reverse proxy is the most common deployment mode for Web systems.

4.2 Tunnel Agent

4.2.1 concept

Through the CONNECT method, the HTTP client asks the tunnel proxy to create a TCP connection to an arbitrary destination server and port; the proxy then blindly forwards the subsequent data between client and server.

4.2.2 supplement

  1. If I access website A through such a proxy, the browser first sends a CONNECT request asking the proxy to create a TCP connection to website A. Once the TCP connection is established, the proxy forwards subsequent traffic without inspecting it. In theory this kind of proxy can carry any TCP-based application-layer protocol, including the TLS used by HTTPS websites; that is why such proxies are called tunnels. For HTTPS, the client negotiates keys directly with the server through the TLS handshake carried over the proxy, so the connection remains secure.
  2. HTTPS-related description: to be added…

5. Gateway

5.1 concept

A gateway can be used as a kind of translator that abstracts out a way to reach a resource. Gateways are the glue between resources and applications, acting as protocol converters.

5.2 Presentation Mode

Representation: <client protocol>/<server protocol>

  • HTTP/*: The server gateway communicates with the client through HTTP and communicates with the server through other protocols.
  • */HTTP: The client gateway talks to the client through other protocols and communicates with the server through HTTP

5.3 Common Gateways

5.3.1 HTTP/* : Server-side Web gateway

As the request flows to the original server, the server-side Web gateway converts the client HTTP request to another protocol

5.3.2 HTTP/HTTPS: server-side security gateway

All incoming Web requests are encrypted through the gateway to provide additional privacy and security. Clients can browse Web content using plain HTTP, but the gateway automatically encrypts the user’s conversation

5.3.3 HTTPS/HTTP: client security accelerator gateway

HTTPS/HTTP gateways can be used as security accelerators. They sit in front of Web servers, often as invisible intercepting gateways or reverse proxies: they receive secure HTTPS traffic, decrypt it, and send plain HTTP requests to the Web servers. These gateways often contain dedicated decryption hardware that decrypts traffic far more efficiently than the origin server could, reducing its load. Because traffic between the gateway and the origin server is unencrypted, take care to ensure that the network between them is secure.

5.3.4 Resource Gateway

The most common gateway is the application server, which combines the target server and the gateway in a single server. An application server is a server-side gateway that communicates with clients over HTTP and connects to server-side applications. The first popular application-gateway API was the Common Gateway Interface (CGI): a standard set of interfaces through which a Web server can load a program in response to an HTTP request for a particular URL, collect the program's output, and send it back in an HTTP response.

6. HTTP cache

6.1 HTTP Cache Policy

The HTTP caching strategy addresses an information-asymmetry problem between client and server. To speed things up, the client caches some resources; but on the next request the client does not know whether the resource has been updated, and the server does not know which version the client has cached or whether it should return the resource again. It is an information-synchronization problem, and the HTTP caching strategy is designed to solve it.

6.2 HTTP Cache Flow Chart

6.3 Forced Cache

6.3.1 concept

With forced caching, the browser looks up the result directly in its own cache and decides whether to use it based on the caching rules stored with that result.

6.3.2 Expires

Expires comes from the HTTP/1.0 specification. Its value is a GMT-format timestamp string, such as Expires: Mon, 18 Oct 2066 23:59:59 GMT, representing the time at which the resource expires: if the current time is before that point, the cache is considered hit. Its drawback is that the expiration time is absolute, so if the server clock deviates significantly from the client clock, caching behavior becomes confused; and since servers' clocks often differ from users' actual time, Expires can be problematic in practice.

6.3.3 Cache-Control

Cache-Control comes from the HTTP/1.1 specification. Freshness is usually judged by the field's max-age value, which is a relative time. For example, Cache-Control: max-age=3600 means the resource is valid for 3600 seconds. The Date response header gives the time the message was sent, so the resource is valid from Date to Date + 3600s.

  • no-cache: do not use the forced cache; use the negotiated cache instead
  • no-store: cache nothing, neither forced nor negotiated
  • public: the content may be cached by anything (client, proxy, CDN, etc.)
  • private: only the client may cache; this is the Cache-Control default
  • max-age=xxx: the cache expires after xxx seconds
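A small sketch of the forced-cache freshness check built from Date and max-age (the header values and function names are illustrative assumptions, not a full implementation of the caching spec):

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def parse_cache_control(value):
    """Split 'max-age=3600, public' into {'max-age': '3600', 'public': True}."""
    out = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        if "=" in part:
            k, v = part.split("=", 1)
            out[k.lower()] = v
        else:
            out[part.lower()] = True
    return out

def is_fresh(headers, now):
    """Forced-cache check: fresh while (now - Date) < max-age."""
    cc = parse_cache_control(headers.get("Cache-Control", ""))
    if "no-store" in cc or "no-cache" in cc:
        return False                  # never cache / go to negotiated cache
    if "max-age" not in cc:
        return False
    age = (now - parsedate_to_datetime(headers["Date"])).total_seconds()
    return age < int(cc["max-age"])

headers = {
    "Date": "Mon, 18 Oct 2021 00:00:00 GMT",
    "Cache-Control": "max-age=3600",
}
print(is_fresh(headers, datetime(2021, 10, 18, 0, 30, tzinfo=timezone.utc)))  # True
print(is_fresh(headers, datetime(2021, 10, 18, 2, 0, tzinfo=timezone.utc)))   # False
```

Because max-age is relative to Date, the clock-skew problem of the absolute Expires timestamp goes away.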

6.4 Negotiated Cache

6.4.1 concept

With the negotiated cache, after its cache expires the browser sends a request to the server carrying a cache identifier, and the server decides, based on that identifier, whether the cached copy can still be used.

Two pairs of header fields are involved, each appearing as a pair: the response to the first request carries Last-Modified or ETag, and subsequent requests carry the corresponding request field If-Modified-Since or If-None-Match.

6.4.2 Last-Modified / If-Modified-Since
  • Last-Modified: response header; the server tells the browser when the resource was last modified.
  • If-Modified-Since: when the client requests the resource again, it sends back the Last-Modified value from the previous response, telling the server the modification time it knows about. The server compares the If-Modified-Since value with the resource's current last-modified time: if the file was modified after If-Modified-Since, the resource is returned with status code 200; otherwise 304 is returned, indicating the resource is unchanged and the cached file can continue to be used.
6.4.3 ETag / If-None-Match
  • ETag: response header; a unique identifier for the resource, given by the server to the browser.
  • If-None-Match: when the client requests the resource again, it sends back the ETag value from the previous response. On receiving the If-None-Match field, the server compares it with the resource's current unique identifier to determine whether the file content has changed: if it has changed, the resource is returned with status code 200; otherwise 304 is returned, indicating the resource is unchanged and the cached file can continue to be used.
6.4.4 Advantages of ETag
  • Some files change periodically without their content changing (only the modification time changes); we don't want the client to think the file was modified and GET it again.
  • Some files change very frequently (say N times per second); If-Modified-Since cannot detect this because its granularity is one second, while ETag stays correct no matter how often the file changes within a second.
  • Some servers cannot determine a file's last modification time precisely.
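The server-side ETag logic above can be sketched in a few lines (the hash-based ETag and the function names are illustrative assumptions; servers choose their own ETag scheme):

```python
import hashlib

def make_etag(body):
    # One common choice: a strong ETag derived from a hash of the body
    return '"' + hashlib.md5(body).hexdigest() + '"'

def conditional_get(body, if_none_match=None):
    """Server-side negotiated-cache logic for ETag / If-None-Match."""
    etag = make_etag(body)
    if if_none_match == etag:
        return 304, None, etag      # unchanged: client keeps its cached copy
    return 200, body, etag          # changed (or first request): send the body

# First request: no validator yet, full 200 response
status, body, etag = conditional_get(b"<html>v1</html>")
print(status)                        # 200

# Revalidation: client sends the ETag back in If-None-Match
status2, body2, _ = conditional_get(b"<html>v1</html>", if_none_match=etag)
print(status2)                       # 304

# Content changed: the ETag no longer matches, full response again
status3, body3, _ = conditional_get(b"<html>v2</html>", if_none_match=etag)
print(status3)                       # 200
```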

6.5 Cache Improvement Schemes

6.5.1 md5 hash/cache

Static files get an MD5 or hash identifier in their names, while the HTML itself is not cached. When a file changes, its name changes too, so the browser fetches the new version immediately instead of having to wait for a cache to expire.
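A minimal sketch of such content fingerprinting (build tools like webpack do this automatically; the 8-hex-digit truncation and the sample file name are assumptions):

```python
import hashlib

def fingerprinted_name(filename, content):
    """app.js + content -> app.<first-8-hex-of-md5>.js (assumes an extension)."""
    stem, _, ext = filename.rpartition(".")
    digest = hashlib.md5(content).hexdigest()[:8]
    return f"{stem}.{digest}.{ext}"

v1 = fingerprinted_name("app.js", b"console.log(1)")
v2 = fingerprinted_name("app.js", b"console.log(2)")
print(v1 != v2)                                               # True: new content, new URL
print(v1 == fingerprinted_name("app.js", b"console.log(1)"))  # True: stable for same content
```

Because each content version gets a unique URL, the fingerprinted files can be served with a very long max-age without any staleness risk.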

6.5.2 CDN cache

CDN is a content distribution network, which relies on edge servers deployed in various places to enable users to obtain the content they need nearby and reduce network congestion.

When the browser finds that its forced cache has expired, it sends a request to a CDN edge node. The edge node checks whether its own cached copy of the requested data has expired: if not, it responds to the user directly; if the data has expired, the CDN issues a back-to-origin request to pull the latest data.

7 Content negotiation mechanism

7.1 concept

A particular document is called a resource. When a client obtains a resource, it sends a request using its corresponding URL. The server uses this URL to select a variant of the resource it points to — each variant is called a representation — and returns the selected representation to the client. The entire resource, along with its various representations, shares a specific URL. When a resource is accessed, the selection of specific presentation forms is determined through content negotiation mechanisms, and there are multiple negotiation methods between client and server.

7.2 classification

7.2.1 Server Driver (Mainstream)

The server examines the client’s request header and decides which version of the resource to provide

7.2.2 Client Driver (Not used much)

The client initiates a request, the server returns a list of available resources, and the client makes a second selection to request the desired resource

7.2.3 Transparent Proxy (Rarely used)

An intermediate device negotiates on behalf of the server

7.3 Related Request headers and Response Headers

7.3.1 request header
  • Accept: Tells the server what media type to send
  • Accept-Language: Tells the server what language to send
  • Accept-Charset: Tells the server what character set to send
  • Accept-Encoding: Tells the server which encoding to use
7.3.2 response headers
  • Content-Type: Indicates the returned media type
  • Content-Language: The returned language
  • Content-Charset: Returns the character set
  • Content-Encoding: Return code

7.4 Approximate Matching

Quality values are defined in the HTTP protocol, allowing clients to list multiple options for each preference category and to associate a priority with each. For example, a client could send an Accept-Language header of the following form, with q values ranging from 0.0 to 1.0:

Accept-Language: zh;q=1.0, en;q=0.5, nl;q=0.2, tr;q=0.0
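A server-side sketch of how such a header could be parsed and ranked (a simplified parser, not a full implementation of the header grammar; language-range matching and wildcards are omitted):

```python
def parse_accept_language(header):
    """Return language tags sorted by q value, dropping q=0 ('not acceptable')."""
    prefs = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        lang = parts[0]
        q = 1.0                          # q defaults to 1.0 when absent
        for p in parts[1:]:
            if p.lower().startswith("q="):
                q = float(p[2:])
        if q > 0:
            prefs.append((q, lang))
    return [lang for q, lang in sorted(prefs, key=lambda t: -t[0])]

print(parse_accept_language("zh;q=1.0, en;q=0.5, nl;q=0.2, tr;q=0.0"))
# ['zh', 'en', 'nl']  -- tr is excluded because q=0.0 means "not acceptable"
```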

8 Breakpoint resume (resumable downloads)

  • Request headers: Range, If-Range
  • Response headers: Accept-Ranges, Content-Range

8.1 Whether range requests are supported

In HTTP/1.1, the response header Accept-Ranges explicitly declares whether range requests are supported; its only defined unit is bytes. For example, an MP4 response header marked Accept-Ranges: bytes indicates that the resource supports range requests.

8.2 Use range request

If it is determined that both ends support scope requests, we can use it when requesting resources.

All files are ultimately bytes stored on disk or in memory, so any file to be operated on can be divided into byte ranges. HTTP only needs to support requesting the range n to n+x of a file.

HTTP/1.1 defines the Range request header to specify the byte range of the requested entity. It ranges from 0 to Content-Length and uses - as a separator. For example, if you have already downloaded 1000 bytes of a resource and want to continue downloading the rest, simply add Range: bytes=1000- to the HTTP request header.

Range also has several different ways of limiting the Range, which can be flexibly customized as needed:

  1. 500-1000: Specifies the start and end ranges, generally used for multithreaded downloads.
  2. 500- : specifies the start range and continues until the end. This is more suitable for resumable, or online playback, etc.
  3. -500: No start interval, meaning only that the last 500 bytes of content entities are required.
  4. 100-300,1000-3000: Specify multiple ranges. There are few scenarios used in this way, so it’s good to know.

HTTP is a protocol negotiated by both sides. Since the request uses the Range header, the response needs the Content-Range header to mark the entity range of the response. The Content-Range format is also clear: the unit (bytes) comes first, then the range of the entity currently transferred and the total length.

Content-Range: bytes 100-999/1000
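A server-side sketch of single-range handling that produces exactly this Content-Range format (multi-range requests, validation of malformed specs, and 416 responses are omitted; the function name is an assumption):

```python
def serve_range(body, range_header):
    """Return (status, chunk, content_range) for a single bytes= range."""
    total = len(body)
    if not range_header or not range_header.startswith("bytes="):
        return 200, body, None            # no range: full response
    spec = range_header[len("bytes="):]
    start_s, _, end_s = spec.partition("-")
    if start_s == "":                     # "-500": the last N bytes
        start = max(0, total - int(end_s))
        end = total - 1
    else:                                 # "500-1000" or open-ended "1000-"
        start = int(start_s)
        end = int(end_s) if end_s else total - 1
    chunk = body[start:end + 1]
    content_range = f"bytes {start}-{end}/{total}"
    return 206, chunk, content_range

body = bytes(1000)
print(serve_range(body, "bytes=100-999")[0::2])  # (206, 'bytes 100-999/1000')
print(serve_range(body, "bytes=-500")[2])        # bytes 500-999/1000
```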

8.3 Resource Changes

When we download a large resource in some download tools, we may occasionally pause and then download again, and it may start again.

It looks as if the HTTP range request failed, but that is not necessarily so; most likely the requested resource changed between requests.

If the source file of the resource changes during the download while its URL stays the same, the file size will probably change (which is easy to detect), but in extreme cases even the length stays the same. If you then simply resume downloading, the pieces may not join together into the file we need once the download completes.

In HTTP range requests, ETag or Last-Modified can be placed in the If-Range request header to detect whether the resource has been modified between segment requests. If both requests hit the same version of the resource file, a 206 status code is returned and the partial transfer continues; otherwise 200 is returned, indicating the file has changed and must be downloaded from the beginning.

8.4 Summary and process

  1. HTTP range requests require HTTP/1.1 or later; if either end uses an earlier version, range requests are not supported.
  2. The Accept-Ranges response header indicates whether range requests are supported.
  3. The Range request header specifies the byte range of the requested content entity.
  4. The Content-Range response header identifies the range of the content currently returned, and Content-Length gives the length of that range.
  5. During the request, If-Range (whose value comes from ETag or Last-Modified) detects whether the resource file has changed; if it has, the download restarts from the beginning.

Fourth, HTTPS

3.1 The biggest drawback of HTTP is insecurity

During HTTP data transmission, everything is sent in plain text, so there is no security at all, especially for sensitive data such as user passwords and credit card information. The way to make HTTP secure is to use real encryption algorithms, which can encrypt and recover data with keys; as long as the keys cannot be obtained by a third party, the data is safe. That is what HTTPS does.

3.2 Encryption Algorithm

HTTPS uses encryption algorithms to solve data transmission security problems. Specifically, it is a hybrid encryption algorithm, that is, a mixture of symmetric encryption and asymmetric encryption. It is necessary to first understand the differences, advantages and disadvantages of the two encryption algorithms.

3.2.1 Symmetric Encryption

Symmetric encryption, as its name implies, uses the same key for encryption and decryption. Common symmetric encryption algorithms include DES, 3DES, and AES

  • Advantages: the algorithm is public, computation is light, encryption is fast and efficient, suitable for encrypting large amounts of data.
  • Disadvantages: both parties must use the same key, so transmitting the key cannot be avoided; if the key is intercepted in transit, the security of symmetric encryption cannot be guaranteed.

Encrypted data looks like gibberish in transit: even if a third party intercepts it, it cannot be decrypted without the key, which keeps the data safe. But there is a fatal problem: since both parties must use the same key, one party has to send it to the other before data transfer can begin, and at that moment the key can be intercepted, after which the encrypted data is easily decrypted.
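The "same key encrypts and decrypts" property can be illustrated with a deliberately toy XOR stream cipher (this construction is NOT secure and is not what DES/3DES/AES do internally; it only demonstrates the symmetric property):

```python
import hashlib
from itertools import count

def keystream(key):
    # Expand the shared key into an endless byte stream (toy construction)
    for i in count():
        yield from hashlib.sha256(key + i.to_bytes(8, "big")).digest()

def xor_crypt(data, key):
    # XOR is self-inverse: applying it twice with the same key restores the data
    return bytes(b ^ k for b, k in zip(data, keystream(key)))

key = b"shared-secret"                         # both sides must hold this key
ciphertext = xor_crypt(b"credit card 1234", key)
plaintext = xor_crypt(ciphertext, key)         # same key reverses the operation
print(plaintext)                               # b'credit card 1234'
```

The weakness described above is visible here too: both sides must somehow agree on `key` first, and anyone who observes that exchange can decrypt everything.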

3.2.2 Asymmetric Encryption

Asymmetric encryption, as the name implies, uses two different keys for encryption and decryption: a public key and a private key, which form a pair. Data encrypted with the public key can only be decrypted with the corresponding private key, and data encrypted with the private key can only be decrypted with the corresponding public key. The basic process for exchanging confidential information is: Party A generates a key pair and publishes one key as the public key; Party B encrypts the confidential information with that public key and sends it to Party A; Party A decrypts it with its own private key. The most commonly used asymmetric algorithm is RSA. Its advantages and disadvantages are:

  • Advantages: The algorithm is open, encryption and decryption use different keys, the private key does not need to be transmitted over the network, high security.
  • Disadvantages: large amount of calculation, encryption and decryption speed is much slower than symmetric encryption.

In the flow above, the client takes the server's public key and generates a random code (call it KEY; it will serve as the key for subsequent symmetric encryption). The client encrypts KEY with the public key and sends it to the server, and the server decrypts it with its private key, so both sides now hold the same KEY. The two parties then use KEY to symmetrically encrypt the data they exchange. In this asymmetric handover, even if a third party obtains the public key and the encrypted KEY, it cannot recover KEY without the private key (which stays on the server, where the risk of disclosure is minimal), which secures the symmetric encryption that follows. This flow is the prototype of HTTPS: it combines the advantages of both encryption algorithms to get both communication security and data-transfer efficiency.

3.3 the HTTPS definition

Hypertext Transfer Protocol Secure (HTTPS) is an extension of the Hypertext Transfer Protocol (HTTP). It is used for secure communication over a computer network, and is widely used on the Internet. In HTTPS, the communication protocol is encrypted using Transport Layer Security (TLS) or, formerly, its predecessor, Secure Sockets Layer (SSL). The protocol is therefore also often referred to as HTTP over TLS, or HTTP over SSL.


HTTPS is not an independent communication protocol but an extension of HTTP that secures the communication. The relationship is: HTTPS = HTTP + SSL/TLS

3.4 HTTPS Encryption and decryption Process

  1. The client requests an HTTPS url and then connects to port 443 of the server (the HTTPS default port, which is similar to HTTP port 80).
  2. A server that uses HTTPS must have a CA certificate, which must be applied for and is issued by a dedicated digital certificate authority (CA) after strict verification. When the certificate is issued, a key pair is generated: the private key is kept by the server itself and must never be disclosed, while the public key is attached to the certificate's information and may be made public. The certificate also carries an electronic signature that verifies its integrity and authenticity and prevents tampering.
  3. The server responds to the client’s request by passing the certificate to the client, which contains the public key and a lot of other information, such as certificate authority information, company information, and certificate validity period. For Chrome, click the lock icon in the address bar and then click the certificate to see the certificate details.

  4. The client parses and validates the certificate. If the certificate is fine, the client takes the server's public key A from it, generates a random code KEY, and encrypts KEY with public key A. If the certificate was not issued by a trusted authority, or the domain name in it does not match the actual domain, or it has expired, a warning is shown to the visitor, who can choose whether to continue.
  5. The client sends the encrypted random KEY to the server as the symmetric-encryption key.
  6. The server receives the encrypted KEY and decrypts it with private key B. After these steps the client and server have finally established a secure connection, neatly solving the symmetric-key distribution problem, and can communicate using symmetric encryption.
  7. The server symmetrically encrypts data with the random KEY and sends it to the client; the client decrypts it with the same KEY.
  8. Both parties happily transfer all data using symmetric encryption.

3.5 Differences between HTTPS and HTTP

  • The most important difference is security. HTTP transmits in plaintext and is less secure without encrypting data. HTTPS (HTTP + SSL/TLS) data transmission is encrypted and secure.
  • To use HTTPS, you need to apply for a CA certificate. Generally, there are few free certificates, so some fees are required. Certificate authorities such as Symantec, Comodo, DigiCert and GlobalSign.
  • HTTP pages respond faster than HTTPS pages.
  • Since HTTPS is an HTTP protocol built on top of SSL/TLS, it is more costly to the server than HTTP.
  • HTTPS and HTTP use completely different connections and use different ports, 443 and 80.

3.6 HTTPS Usage Cost

  • Certificate fee and update maintenance
  • The user access speed is reduced
  • More server resources are required, resulting in higher costs

3.7 HTTPS Handshake details

1. TCP three-way handshake

2. The client sends client_hello

  1. TLS Version Information
  2. Random number (for subsequent key negotiation) random_C
  3. Encryption suite candidate list
  4. Compression algorithm candidate list
  5. Extension field
  6. other

3. The server sends server_hello

After receiving client_hello from the client, the server sends server_hello and returns the result of the negotiation:

  1. Select TLS protocol version version
  2. Random number random_S
  3. Cipher Suite is selected
  4. The compression algorithm chosen is compression method
  5. other

4. The server sends the certificate

After sending server_hello, the server starts sending its own certificate. For example, as shown in the figure, the packet containing the certificate is 3761 bytes long, so it is segmented by TCP and the certificate is sent in three [TCP segment] packets.

5. (Optional) The Server sends the Server Key Exchange.

Depending on the encryption algorithm, it may not be sent

6. The Server sends Server Hello Done

Notifies the client that the server_hello information has been fully sent

7. The client sends client_KEY_exchange +change_cipher_spec+encrypted_handshake_message

  1. client_key_exchange: after verifying the certificate, the client sends its own public-key parameter to the server; at this point the client has in fact already computed the negotiated key
  2. change_cipher_spec: the client notifies the server that subsequent communication will use the negotiated key and encryption algorithm
  3. encrypted_handshake_message: used to test the validity and consistency of the key

8. The server sends a New Session Ticket

The server gives the client a session ticket, used for a period of time (before it times out) during which both parties communicate with the negotiated key.

9. The server sends change_cipher_spec

The server decrypts the parameters sent by the client, computes the negotiated key with the same algorithm, verifies the key's validity against the encrypted_handshake_message sent by the client, and then notifies the client that it is ready to communicate with the negotiated key

10. The server sends the encrypted_handshake_message

The purpose is likewise to test the validity of the key. The message sent by the client verifies that the client can encrypt and the server can decrypt; conversely, the message sent by the server verifies that the server can encrypt and the client can decrypt.

11. Complete key negotiation and start sending data

The data is also sent in segments

12. TCP wave four times after data is sent

3.8 supplement

3.8.1 TCP Segmentation and IP Fragmentation

TCP segments data when sending and reassembles it when receiving. Similarly, IP packets whose length exceeds a certain value are fragmented. Both segmentation and fragmentation split the data delivered by the upper layer so that it fits within the transmission limit of the layer below.

  • Segmentation refers specifically to splitting data at the transport layer with TCP
  • Fragmentation refers to splitting data at the IPv4 network layer

3.8.2 Maximum Transmission Unit (MTU)

The MTU of each link may be different. The MTU of an end-to-end path is determined by the MTU of the link with the smallest MTU on the path.

MTU is a limit on data frames imposed by the link layer of the network. It is a Layer 2 limit, and different Layer 2 protocols may use different values; when the Layer 2 protocol is Ethernet, the MTU is usually 1500 bytes. Interfaces directly connected by the same cable must use the same MTU.

If an IP packet is longer than the MTU of the current link, it is fragmented so that no fragment exceeds the MTU. IP datagrams transmitted in fragments do not necessarily arrive in order, but information in the IP header allows them to be reassembled in order. IP datagrams are fragmented and reassembled at the network (IP) layer.

3.8.3 Maximum Segment Size (MSS)

The MSS field in the TCP header option is used to notify the peer end during the TCP three-way handshake. Generally, the MSS value of a TCP connection is the smaller MSS value of the communication party. The conversion relationship between MSS and MTU is as follows:

MTU = MSS + TCP header length + IP header length

Therefore, on Ethernet (taking IPv4 as an example):

MSS = Ethernet MTU − TCP header length − IPv4 header length = 1500 − 20 − 20 = 1460 bytes

If MSS is not specified, the default value is 536 bytes, because the standard MTU on the Internet is 576 bytes: 576-byte MTU = 20-byte TCP header + 20-byte IPv4 header + 536-byte MSS.
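The arithmetic above can be checked with a tiny helper (assuming IPv4 without options, so both the TCP and IP headers are 20 bytes):

```javascript
// MSS = MTU − TCP header − IP header (20 bytes each without options)
function mssForMtu(mtu, tcpHeader = 20, ipHeader = 20) {
  return mtu - tcpHeader - ipHeader;
}

console.log(mssForMtu(1500)); // 1460 — Ethernet
console.log(mssForMtu(576));  // 536  — Internet standard MTU, the default MSS
```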

When an application sends data larger than the MSS, TCP segments the data so that no segment exceeds the MSS. TCP segments transmitted separately may not arrive in order, but TCP, which implements reliable transmission, has a mechanism for handling disorder: data is rearranged in the receive buffer using the sequence numbers of the segments. Reassembly of TCP segments is done at the TCP transport layer.
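The reordering mechanism can be sketched as a toy (here sequence numbers index whole segments for simplicity; real TCP sequence numbers count bytes):

```javascript
// Toy sketch of TCP-style reordering: each segment carries a sequence
// number; the receiver buffers out-of-order arrivals and emits in order.
function reorder(segments) {
  return segments
    .slice()                          // don't mutate the caller's array
    .sort((a, b) => a.seq - b.seq)    // rearrange by sequence number
    .map(s => s.data)
    .join('');
}

// Segments arriving out of order from the network:
const arrived = [
  { seq: 2, data: 'lo ' },
  { seq: 1, data: 'hel' },
  { seq: 3, data: 'tcp' },
];
console.log(reorder(arrived)); // 'hello tcp'
```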

3.8.4 Segmentation and Sharding

After TCP segmentation, the sender will not additionally fragment the packets at its own IP layer, because the MSS is derived from the MTU: data that satisfies the MSS limit also satisfies the physical MTU limit. However, IP fragmentation can still occur even when TCP has segmented, because TCP segmentation only satisfies the MTU at the two ends of the communication. If the transmission path passes through a link with a smaller MTU, the device forwarding onto that link will fragment again based on the smaller MTU. Of course, if the two communicating hosts are directly connected, the MTU negotiated during TCP connection setup (the smaller of the two network adapters' MTUs) is the end-to-end path MTU, so as long as the sending end performs TCP segmentation, no IP fragmentation will occur during the whole communication.

4. Protocols that extend HTTP's functionality

4.1 HTTP Bottlenecks

  • Head-of-line blocking: only one request at a time can be in flight on a TCP connection; subsequent requests must queue until the previous one completes
  • Multiple TCP connections: although HTTP/1.1 pipelining can support concurrent requests, it proved difficult to implement in browsers, and Chrome, Firefox, and others shipped with pipelining disabled. So version 1.1 relies on multiple TCP connections for concurrency, which are costly to establish and slow to start
  • Requests can only be initiated by the client
  • Request and response headers are sent without compression
  • The same headers are sent on every request, which is wasteful

4.2 the WebSocket

4.2.1 Concept

WebSocket is a protocol for full duplex communication over a single TCP connection.

In the WebSocket API, the browser and server only need to complete a handshake to create a persistent connection and two-way data transfer.

4.2.2 Features

  • Support two-way communication, more real-time
  • Flexible and efficient
  • Less control overhead. After the connection is created, when the WS client and server exchange data, the packet header controlled by the protocol is small

4.2.3 How a WebSocket Connection Is Established

1. Client: request the protocol upgrade

First, the client initiates a protocol upgrade request. As you can see, the standard HTTP packet format is adopted and only the GET method is supported.

GET / HTTP/1.1
Host: localhost:8080
Origin: http://127.0.0.1:3000
Connection: Upgrade
Upgrade: websocket
Sec-WebSocket-Version: 13
Sec-WebSocket-Key: w4v7O6xFTi36lq3RNcgctw==

The significance of the key request header is as follows:

  • Connection: Upgrade: indicates that the protocol needs to be upgraded
  • Upgrade: websocket: indicates an upgrade to the WebSocket protocol.
  • Sec-WebSocket-Version: 13: indicates the WebSocket version. If the server does not support this version, it must return a Sec-WebSocket-Version header containing the versions it does support.
  • Sec-WebSocket-Key: corresponds to the Sec-WebSocket-Accept header in the server's response, providing basic protection against malicious or unintentional connections.
2. Server: respond to the protocol upgrade

The status code 101 indicates protocol switchover. The protocol upgrade is completed here, and subsequent data interaction is based on the new protocol

HTTP/1.1 101 Switching Protocols
Connection: Upgrade
Upgrade: websocket
Sec-WebSocket-Accept: Oy4NRAQ13jhfONC7bP8dTKb4PTU=

3. Calculating Sec-WebSocket-Accept

Sec-WebSocket-Accept is computed from the Sec-WebSocket-Key in the client request header. The formula is:

  1. Concatenate the Sec-WebSocket-Key with the fixed string 258EAFA5-E914-47DA-95CA-C5AB0DC85B11.
  2. Compute the SHA-1 digest and convert it to a Base64 string.

The pseudocode is as follows:

toBase64( sha1( Sec-WebSocket-Key + 258EAFA5-E914-47DA-95CA-C5AB0DC85B11 ) )

Verify the previous result:

const crypto = require('crypto');
const magic = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';
const secWebSocketKey = 'w4v7O6xFTi36lq3RNcgctw==';

let secWebSocketAccept = crypto.createHash('sha1')
	.update(secWebSocketKey + magic)
	.digest('base64');

console.log(secWebSocketAccept);
// Oy4NRAQ13jhfONC7bP8dTKb4PTU=

4.2.4 Purpose of Sec-WebSocket-Key/Accept

As mentioned earlier, Sec-WebSocket-Key/Sec-WebSocket-Accept are used to provide basic protection against malicious and accidental connections.

The functions are summarized as follows:

  • Prevent the server from accepting invalid WebSocket connections
  • Ensure the server understands WebSocket connections. Because the WS handshake phase uses HTTP, a plain HTTP server might otherwise process and answer the WS connection request; the client can use Sec-WebSocket-Key to confirm the server really understands the WS protocol.
  • Browsers forbid setting Sec-WebSocket-Key and related headers on Ajax requests, which prevents a WebSocket upgrade from being triggered when the client sends an Ajax request.
  • Prevent a reverse proxy (which does not understand the WS protocol) from returning wrong data. For example, if a reverse proxy receives two WS upgrade requests and caches the first response, it might return that (meaningless) cached response for the second request.
  • The main purpose of Sec-WebSocket-Key is not data security: the conversion from Sec-WebSocket-Key to Sec-WebSocket-Accept is public and very simple. Its main role is to prevent common accidental (unintentional) connections.

Note: the Sec-WebSocket-Key/Sec-WebSocket-Accept conversion provides only a basic safeguard, not a real security guarantee.

4.2.5 Application Scenarios

  • Instant chat communication
  • Multi-player games
  • Online co-editing/editing
  • Pull and push of real-time data stream
  • Sports/games live
  • Real-time map location

4.3 SPDY (replaced by HTTP2)

SPDY is not so much a new protocol as a session layer inserted below HTTP, between HTTP and TCP.

Some improvements to SPDY:

  • Multiplexing, request optimization
  • Support server push technology
  • Compressed HTTP headers
  • Force the use of SSL transport protocol
  • You can set the resource priority

4.4 HTTP2.0

HTTP2.0 ported many of the features in SPDY. Key concepts:

  • Stream: a bidirectional byte Stream over an established TCP connection that can carry one or more messages.
  • Message: A complete HTTP request or response, consisting of one or more frames. Frames for a particular message are sent on the same stream, which means that an HTTP request or response can only be sent on one stream.
  • Frame: The basic unit of communication.

There can be any number of streams on a TCP connection.

4.4.1 Binary frame splitting layer

At the heart of HTTP2's performance improvements is the binary framing layer. HTTP2 is a binary protocol: it transmits data in binary format rather than the text format of 1.x. An HTTP/1.1 response is text, while 2.0 splits the response into frames; HEADERS and DATA are the frame types shown in the figure. That is, an HTTP response is transmitted as two frames, encoded in binary.

4.4.2 Multiplexing

As mentioned above, HTTP/1.1 suffers from head-of-line blocking and multiple TCP connections, whereas HTTP/2 multiplexing allows multiple request-response messages to be sent concurrently over a single TCP connection. HTTP2 establishes one TCP connection, and a connection can carry any number of concurrent streams. Messages are transmitted as one or more frames, which the two sides can send continuously; the stream identifier in each frame marks which stream the frame belongs to, so on receipt the peer can combine all frames of each stream, according to the stream identifier, back into a complete block of data. HTTP2 therefore needs only one connection per domain, instead of the six to eight connections HTTP/1.1 typically opens.
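The reassembly described above can be sketched as a toy (the frame and stream shapes here are invented for illustration, not the real HTTP/2 wire format):

```javascript
// Toy sketch of HTTP/2 frame reassembly: frames from many streams
// interleave on one connection; the receiver groups them by stream id.
function reassemble(frames) {
  const streams = new Map();
  for (const { streamId, payload } of frames) {
    if (!streams.has(streamId)) streams.set(streamId, []);
    streams.get(streamId).push(payload);
  }
  // Concatenate each stream's frames in arrival order.
  const result = {};
  for (const [id, parts] of streams) result[id] = parts.join('');
  return result;
}

// Frames of streams 1 and 3 interleaved on the same connection:
const frames = [
  { streamId: 1, payload: '<html>' },
  { streamId: 3, payload: '{"a":' },
  { streamId: 1, payload: '</html>' },
  { streamId: 3, payload: '1}' },
];
console.log(reassemble(frames));
// { '1': '<html></html>', '3': '{"a":1}' }
```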

A new problem introduced by multiplexing is that, with the connection shared, critical requests may be blocked, so a priority mechanism is needed. Each stream in HTTP2.0 can carry a priority, and streams with higher priority are processed by the server and returned to the client first. For example, when the browser loads the home page, the HTML content should be displayed first, and then the various static resource files and script files should be loaded, ensuring users see the page content as soon as possible.

4.4.3 Header Compression

In 1.x, headers are transferred as text, typically adding 500-800 bytes of overhead per transfer. It is now normal for a web page to make hundreds of requests, and some header fields of each request are identical, such as cookie and user-agent. HTTP2 compresses headers with the HPACK compression format. Header compression requires the browser and the server to:

  • Maintain an identical static dictionary of common header names and combinations of common header names and values
  • Maintain the same dynamic dictionary that can be added dynamically
  • Encode transmitted header fields with static Huffman coding

Part of the HTTP2 static dictionary:

So when we transfer the header field, for example method:GET, we only need to transfer the index value of method:GET in the static dictionary, which is one byte. In static dictionaries such as User-Agent and cookie, there is only a header name but no value. The first transmission requires the index of User-Agent in the static dictionary and its value, and the value will adopt static Huffman encoding to reduce the volume.

After the first transfer of user-Agent, the browser and server side add it to their dynamic dictionary. The index can then be transferred in a single byte.
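The dictionary mechanism can be sketched as a toy (this is only the indexing idea; real HPACK also Huffman-codes literals, bounds the dynamic table, and evicts old entries — the table contents below are illustrative):

```javascript
// Toy sketch of the HPACK idea: a shared static table maps common
// header name/value pairs to small indexes; a first-seen header is sent
// literally and appended to a shared dynamic table for next time.
const staticTable = new Map([
  [':method: GET', 2], // index 2 in the real HPACK static table
  [':path: /', 4],
]);

function makeEncoder() {
  const dynamicTable = new Map();
  let nextIndex = 62; // dynamic entries start after the static table in HPACK
  return function encode(header) {
    if (staticTable.has(header)) return { index: staticTable.get(header) };
    if (dynamicTable.has(header)) return { index: dynamicTable.get(header) };
    dynamicTable.set(header, nextIndex);
    return { literal: header, addedAt: nextIndex++ };
  };
}

const encode = makeEncoder();
console.log(encode(':method: GET'));        // { index: 2 } — a single byte
console.log(encode('user-agent: Mozilla')); // sent literally, indexed as 62
console.log(encode('user-agent: Mozilla')); // { index: 62 } on repeat
```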

4.4.4 Server Push

Server-side push enables the server to predict the resources needed by the client and push them to the client actively. For example, if the client requests index.html, the server can push script.js and style.css. The implementation principle is that when the client sends a page request, the server can analyze the other resources that the page depends on and actively push them to the client’s cache. When the client receives the request of the original web page, the resources it needs are already in the cache.

For each resource it wishes to send, the server sends a PUSH_PROMISE frame, and the client can reject a push by sending a RST_STREAM frame (when the resource is already in the cache). This step takes place before the parent response (index.html), and the client knows what resources the server is trying to push, so it doesn’t create duplicate requests for those resources. When the client receives the index.html response, script.js and style.css are already in the cache.

4.5 QUIC & HTTP 3.0

4.5.1 Concepts and Background

1. What is QUIC

QUIC, or Quick UDP Internet Connections, is an experimental network transport protocol proposed by Google, located at the transport layer of the OSI model. QUIC aims to fix the defects of TCP and eventually replace it, reducing the amount of data transmitted, lowering connection-establishment latency, and speeding up page transfer.

2. Why UDP
  • UDP connectionless (no cost for establishing and ending a connection)
  • UDP data is unordered and datagrams are independent of each other (no head-of-line blocking problem)
  • UDP protocol simple (low modification cost)

Several features of UDP perfectly meet our requirements; the only drawback is that UDP is not reliable like TCP. Therefore, Google decided to build a new protocol on top of UDP that keeps TCP's advantages: the QUIC protocol.

3. Understand HTTP3

The HTTP protocol that runs on top of QUIC is called HTTP/3 (HTTP-over-QUIC). The QUIC protocol is based on UDP, and QUIC also integrates the advantages of TCP, TLS, and HTTP/2.

  • Features:
  1. Reduced handshake delay (1-RTT or 0-RTT)
  2. Multiplexing, and without the blocking problems of TCP
  3. Connection migration: (mainly on the client side) when moving from Wi-Fi to 4G, the connection is not dropped
  4. TLS 1.3 encryption is integrated
  • HTTP/3 is not directly related to HTTP/1.1 and HTTP/2, nor is it an extension of HTTP/2
  • HTTP/3 will be a whole new WEB protocol
  • HTTP/3 is currently in the development and testing stages

4.5.2 Multiplexing

Head-of-line blocking is actually a very common problem in computer networks. Its main symptom is that one packet holds up a batch of packets: if it does not arrive, it keeps blocking, and the packets behind it are affected. The problem occurs in both HTTP1.x and TCP.

When HTTP1.1 introduced pipelining, requests were managed through a queue, so if the response order did not match the request order, the queue blocked at its head. To solve this problem, HTTP2.0 introduced a multiplexing mechanism that identifies the corresponding request in the frame header, eliminating head-of-line blocking at the HTTP layer and improving channel utilization.

Although HTTP2 solves head-of-line blocking at its own layer, the TCP layer still has it. To deliver data to the upper layer in order, TCP waits for all the data to arrive, then sorts and assembles it. Once a packet is lost, everything must wait for its retransmission, blocking the data of the entire connection.

To solve this problem, QUIC adopts multiplexing: one connection can carry multiple streams at the same time and initiate multiple requests simultaneously, and these requests are completely independent; blocking in one request does not affect the others.

4.5.3 Low Wait delay (0RTT)

Round-trip Time (RTT) refers to the Time consumed by data packets to and from a network. RTT is a common indicator used to measure network connection establishment. It consists of round-trip propagation delay, queuing delay within network devices, and data processing delay of applications.

Unlike the traditional HTTPS stack, which needs 3 RTTs, QUIC can establish a connection with 0 RTT: the very first packet can carry data.

4.5.4 Connection mode

QUIC's 0RTT is not unconditional: parties interacting for the first time still need 1 RTT for key negotiation (the DH algorithm). QUIC connection establishment therefore falls into two cases: first connection and non-first connection.

4.5.4.1 Initial Connection

Its main content is the key negotiation and data transmission between client and server. The basic process is as follows:

  1. On the first connection, the client sends a Client Hello request to the server.

  2. The server generates a prime p and an integer g, generates a random number as its private key, and computes its public key. The server packages the public key, p, and g as "config" and sends it to the client.
  3. The client randomly generates its own private key, reads g and p from config, and computes its own public key.
  4. Using its own private key and the server public key read from config, the client derives the key K used to encrypt subsequent data.
  5. The client encrypts the business data with key K and sends it, together with its own public key, to the server.
  6. The server derives the same encryption key K from its private key and the client's public key.
  7. To ensure data security, the generated key K is used only once. The server then generates a new key pair by the same rules and uses it to derive a new key M.
  8. The server sends the new public key and the data encrypted with the new key M to the client. The client computes key M from the new public key and its original private key, and decrypts.
  9. Subsequent data exchange between client and server is completed with key M; key K is used only once.

4.5.4.2 Non-initial Connection

On the first connection, the client stores the server's config (server public key, random prime p, and random integer g). On subsequent connections the client can therefore compute the communication key directly from config, skipping the 1 RTT of key negotiation and achieving 0-RTT data exchange.

To ensure security, the client keeps config only for a limited time; after config expires, the first-connection key exchange is required again.

4.5.5 Forward Security

Forward Secrecy is a security attribute of communication protocols in cryptography. It means that the leakage of the master key used for a long time will not lead to the leakage of the previous session key. Forward security protects past communications from future exposure of passwords or keys. If the system has forward security, it can guarantee the security of historical communication in case of master key leakage, even if the system is attacked actively.

To put it simply, forward security means that the previously encrypted data will not be leaked even if the key is leaked, and only the current data will be affected.

As the preceding DH encryption process shows, to ensure data security the key used in each communication is used only once: after each interaction it is destroyed and regenerated by the same rules. Even if a key leaks, the other party can obtain only the information protected by that key; other data is unaffected.

4.5.6 Forward error correction

Forward error correction (FEC) is a form of error control: before the signal enters the transmission channel, it is encoded in advance according to an algorithm, adding redundant codes derived from the signal itself; the receiving end decodes the received signal with the corresponding algorithm, finding and correcting errors introduced during transmission.

When packet loss occurs, TCP uses the retransmission mechanism, while QUIC uses the forward error correction mechanism.

  • For TCP, if a packet is lost, a delay is needed to determine that loss occurred before the retransmission mechanism kicks in. This process causes some congestion and lengthens the transfer time.
  • QUIC XORs each group of data it sends and transmits the result as a checksum packet. If one packet in the group is lost, the lost data can be recovered from the checksum packet and the other packets, avoiding the delay of retransmission.
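The XOR recovery idea can be sketched as a toy (equal-length packets are assumed for simplicity; this is only the principle, not QUIC's actual FEC scheme):

```javascript
// Toy XOR-based FEC sketch: the parity packet is the XOR of a group of
// packets; any single lost packet can be rebuilt from the rest + parity.
function xorPackets(packets) {
  const len = Math.max(...packets.map(p => p.length));
  const parity = Buffer.alloc(len);
  for (const p of packets)
    for (let i = 0; i < p.length; i++) parity[i] ^= p[i];
  return parity;
}

const group = [Buffer.from('pkt0'), Buffer.from('pkt1'), Buffer.from('pkt2')];
const parity = xorPackets(group); // sent alongside the group

// Pretend group[1] was lost in transit; recover it from the others.
const recovered = xorPackets([group[0], group[2], parity]);
console.log(recovered.toString()); // 'pkt1'
```

Note the limitation of this scheme: it can recover only one lost packet per group, which is why it trades bandwidth (the extra parity packet) for avoiding a retransmission round trip.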

4.5.7 Connection Migration

In the current mobile network environment, the user's network may change at any time, for example from the mobile network to a Wi-Fi environment, at which point our IP address changes. TCP uses a five-tuple (source IP address, source port, destination IP address, destination port, and transport-layer protocol) to uniquely identify a connection, so the previous connection cannot be kept and a new TCP connection must be established.

To solve this problem, QUIC, built on UDP, abandons the five-tuple concept entirely: it generates a random 64-bit number as the connection identifier. Even when we switch IP addresses, the identifier does not change, so the connection can be restored quickly, greatly improving the experience of mobile client applications.

5. Web security analysis

5.1 SQL injection

SQL injection occurs when user input is concatenated into a SQL statement and changes the statement's semantics. Parameterized queries prevent this: because a parameterized query can reuse its execution plan, and reusing the execution plan means the semantics of the SQL cannot change, injection is blocked; injection can occur only where execution plans cannot be reused.
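A minimal sketch of the difference (the query-object shape mirrors clients such as node-postgres, but no database is involved here; the table and input are invented for illustration):

```javascript
// Vulnerable pattern: concatenation lets input rewrite the statement.
function unsafeQuery(name) {
  return "SELECT * FROM users WHERE name = '" + name + "'";
}

// Parameterized pattern: the SQL text is fixed; input travels separately
// as a bound value, so it can never change the statement's semantics.
function safeQuery(name) {
  return { text: 'SELECT * FROM users WHERE name = $1', values: [name] };
}

const attack = "x'; DROP TABLE users; --";
console.log(unsafeQuery(attack)); // injected SQL reaches the parser
console.log(safeQuery(attack));   // text unchanged; the attack is just data
```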

5.2 XSS(Cross-site Scripting Attacks)

Overview

An XSS attack usually refers to exploiting vulnerabilities left during web development to inject malicious script code into a page by some clever means, so that users load and execute the malicious program crafted by the attacker. After a successful attack, the attacker may obtain various things including but not limited to higher permissions (for example, to perform certain operations), private page content, sessions, and cookies.

The essence of XSS: malicious code, unfiltered, is mixed in with the site's normal code; the browser cannot tell which scripts are trustworthy, so the malicious script gets executed.

Classification

Reflected XSS

Reflected XSS vulnerabilities are common in features that pass parameters through the URL, such as site search and redirects. Because the user must actively open the malicious URL for the attack to take effect, attackers often combine several techniques to induce the user to click.

Attack steps:

  1. The attacker constructs a special URL that contains malicious code.
  2. When a user opens a URL with malicious code, the web server takes the malicious code out of the URL, splices it into HTML and returns it to the browser.
  3. When the user’s browser receives the response, it parses it and executes the malicious code mixed in.
  4. Malicious code steals user data and sends it to the attacker’s website, or impersonates the user’s behavior and calls the target website interface to perform the operations specified by the attacker.

Stored XSS

The cause of a stored XSS vulnerability is similar to that of a reflected one, except that the malicious code is stored on the server, so other users (front end) and administrators (front and back end) execute the malicious code when they access the resource.

Attack steps:

  1. The attacker submits malicious code to the database of the target website.
  2. When the user opens the target website, the website server takes the malicious code out of the database, splices it into HTML and returns it to the browser.
  3. When the user’s browser receives the response, it parses it and executes the malicious code mixed in.
  4. Malicious code steals user data and sends it to the attacker’s website, or impersonates the user’s behavior and calls the target website interface to perform the operations specified by the attacker.

DOM-based XSS

In a DOM-based XSS attack, extracting and executing the malicious code is done entirely by the browser; it is a security vulnerability in the front-end JavaScript itself, whereas the other two kinds of XSS are server-side vulnerabilities.

Attack steps:

  1. The attacker constructs a special URL that contains malicious code.
  2. The user opens a URL with malicious code.
  3. When the user’s browser receives the response, it parses it and executes it. The front-end JavaScript picks up the malicious code in the URL and executes it.
  4. Malicious code steals user data and sends it to the attacker’s website, or impersonates the user’s behavior and calls the target website interface to perform the operations specified by the attacker.

XSS prevention

XSS attacks have two main elements:

  1. Users submit malicious code
  2. The browser executes malicious code

Input filtering

The back end filters input before writing it to the database, and returns "safe" content to the front end. But there is a problem: at submission time we cannot be sure where the content will be output, and escaped content can end up garbled. Input filtering is therefore reliable only for well-defined inputs such as phone numbers and email addresses.

HTML escaping

If concatenating HTML is necessary, you need to use a suitable escape library to adequately escape the insertion points of the HTML template.
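A minimal escaping helper of the kind such libraries provide (a sketch only; real libraries also handle context-specific escaping for attributes, URLs, and scripts):

```javascript
// Escape the five characters that are special in HTML text and
// attribute contexts. '&' must be replaced first.
function escapeHtml(str) {
  return String(str)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

console.log(escapeHtml('<script>alert(1)</script>'));
// &lt;script&gt;alert(1)&lt;/script&gt;
```

Inserted into an HTML template, the escaped string renders as inert text instead of executing as a script.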

Pure front-end rendering

Pure front-end rendering process:

  • The browser first loads a static HTML that does not contain any business-related data.
  • The browser then executes the JavaScript in the HTML.

JavaScript loads the business data through Ajax and calls the DOM API to update it to the page.

In pure front-end rendering, we explicitly tell the browser whether we are setting text (.innerText), an attribute (.setAttribute), a style (.style), and so on. Browsers cannot easily be tricked into executing unexpected code. However, pure front-end rendering still needs to guard against DOM-based XSS vulnerabilities.

In many internal and management systems, pure front-end rendering is perfectly appropriate. However, for pages with high performance requirements or SEO requirements, we still face the problem of concatenated HTML.

Prevents DOM XSS attacks

A DOM-based XSS attack is really a case of the site's own front-end JavaScript being insufficiently strict, executing untrusted data as code.

Be careful with .innerHTML, .outerHTML, and document.write(). Do not insert untrusted data into the page as HTML; use .textContent, .setAttribute(), and similar APIs instead.

If you use the Vue/React technology stack and avoid v-html / dangerouslySetInnerHTML, you avoid the innerHTML/outerHTML class of XSS risks in the front-end rendering phase.

Inline DOM event handlers and sinks such as location, onclick, onerror, onload, and onmouseover, as well as JavaScript APIs such as eval(), setTimeout(), and setInterval(), can all run strings as code. If untrusted data is concatenated into strings passed to these APIs, it easily creates a security hole, which must be avoided.

Content Security Policy

Strict CSP can play the following roles in XSS prevention:

  • Forbid loading cross-origin code, preventing complex attack logic.
  • Forbid cross-origin submission: even if the website is compromised, user data will not leak to another origin.
  • Forbid inline script execution (a strict rule, currently observed in use on GitHub).
  • Disable unauthorized script execution (a new feature, in use on Google Maps mobile).
  • Proper use of CSP reports helps discover XSS promptly, which helps fix problems as soon as possible.

5.3 CSRF(Cross-site Request Forgery)

concept

Cross-site request forgery (CSRF): an attacker induces the victim to visit a third-party website, which sends a cross-site request to the site under attack. Because the victim has already obtained login credentials on that site, the request bypasses the back-end user authentication and impersonates the user to perform an operation on the attacked site.

process

  1. The victim logs in to a.com and retains the login credentials (cookie).
  2. The attacker lures the victim into visiting b.com.
  3. b.com sends a request to a.com: a.com/act=xx.
  4. When a.com receives the request, it verifies it, confirms the victim's credentials, and mistakes it for a request sent by the victim himself.
  5. a.com executes act=xx on behalf of the victim.
  6. The attack is complete: the attacker, without the victim's knowledge, has impersonated the victim and made a.com perform an operation of the attacker's choosing.

Several common types of attacks

CSRF of the GET type

GET-type CSRF is very simple to exploit, needing only one HTTP request. It is typically used like this:

<img src="http://bank.example/withdraw?account=xiaoming&amount=10000&for=hacker">

After the victim visits the page containing this img, the browser automatically sends an HTTP request to http://bank.example/withdraw?account=xiaoming&amount=10000&for=hacker, and bank.example receives a cross-domain request carrying the victim's login credentials.

CSRF of the POST type

This type of CSRF is typically exploited using an auto-submitted form, such as:

<form action="http://bank.example/withdraw" method="POST">
    <input type="hidden" name="account" value="xiaoming" />
    <input type="hidden" name="amount" value="10000" />
    <input type="hidden" name="for" value="hacker" />
</form>
<script> document.forms[0].submit(); </script>

When the victim visits this page, the form is submitted automatically, simulating a POST operation.

The requirements for a POST-type attack are slightly stricter than for a GET-type one, but it is still not complex. Any personal website, blog, or site hosting hacker-uploaded content can be the source of an attack, so a back-end interface cannot rely on accepting only POST for its security.

CSRF of link type

Link-type CSRF is less common; it requires the user to click a link to trigger, whereas in the other two cases the user is compromised simply by opening a page. This type usually embeds a malicious link in images posted on a forum, or lures users in by means of an advertisement. Attackers typically trick users into clicking with sensational wording, for example:

<a href="http://test.com/csrf/withdraw.php?amount=1000&for=hacker" target="_blank">Big news!!</a>

Since the user has logged in to trusted website A and kept the login state, the attack succeeds as soon as the user actively visits the PHP page above.

The characteristics of CSRF

  • Attacks are generally launched from a third-party site, not the attacked site itself; the attacked site cannot prevent the attack from starting.
  • The attack uses the victim’s login credentials on the attacked site to submit operations while posing as the victim; it does not steal data directly.
  • Throughout the whole process the attacker never actually obtains the victim’s login credentials; they are merely “borrowed”.
  • Cross-site requests can be made in many ways: image URLs, hyperlinks, CORS, form submissions, and more. Some of these can be embedded directly in third-party forums and articles, making them hard to trace.

CSRF is typically cross-domain, because an external domain is usually easier for the attacker to control. However, if the local domain has easily exploited features, such as forums and comment areas where images and links can be posted, the attack can be carried out directly within the local domain, and such an attack is even more dangerous.

Protection strategy

CSRF is usually launched from a third-party website, so the attacked website cannot prevent the attack from starting; it can only strengthen its own defenses against CSRF to improve security. Recall the two characteristics of CSRF described above:

  • CSRF (usually) originates from a third-party domain.
  • A CSRF attacker cannot obtain information such as cookies; they can only use them.

In view of these two points, we can formulate special protection strategies as follows:

  1. Block requests from unknown external domains
  • Same-origin detection
  • SameSite cookies
  2. Require the request to carry extra information that only the local domain can supply
  • CSRF Token
  • Double-cookie validation

Same-origin detection

Since most CSRF attacks come from third-party websites, we can simply forbid untrusted domains (identified via the Referer header) from making requests to us.

According to the HTTP protocol, there is a field in the HTTP header called Referer that records the source address of the request. For Ajax requests and for image and script resource requests, the Referer is the address of the page that initiated the request. For page navigation, the Referer is the address of the previous page that opened this one. Therefore, we can use the origin portion of the URL in the Referer to determine the request's source domain.

This method is not foolproof. The Referer value is provided by the browser, and although the HTTP specification has clear requirements, browsers may differ in how they implement Referer, and there is no guarantee that the browser itself has no security vulnerabilities. A method that verifies the Referer value therefore relies on the security of a third party (the browser), which in theory is not very safe. In some cases, an attacker can hide, or even modify, the Referer of their own requests.
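As a minimal sketch of this check, assuming a Node.js server where request headers arrive lowercased (the function name and the allowlist contents are hypothetical):

```javascript
// Same-origin detection sketch: derive the request's source from the
// Origin header, falling back to the origin part of the Referer, and
// compare it against an allowlist of trusted domains.
const ALLOWED_ORIGINS = new Set(['https://bank.example']);

function isSameOriginRequest(headers) {
  const source = headers['origin'] ||
    (headers['referer'] ? new URL(headers['referer']).origin : null);
  // If neither header is present we cannot verify the source, so reject.
  if (!source) return false;
  return ALLOWED_ORIGINS.has(source);
}
```

Whether to reject requests that carry neither header is a policy decision each site must make; as discussed above, a missing Referer is ambiguous, so a strict site rejects while a lenient one allows.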

In 2014, the W3C’s Web Application Security Working Group published a draft Referrer Policy that specifies in detail how browsers should send the Referer. Most newer browsers now support this draft, so we finally have the flexibility to control the Referer policy for our own sites. The draft specifies five Referer policies: no-referrer, no-referrer-when-downgrade, origin, origin-when-cross-origin, and unsafe-url.

For CSRF protection, the Referrer Policy should be set to same-origin: same-origin links and references still send the Referer, while cross-origin access carries no Referer at all. For example, if aaa.com references a resource on bbb.com, no Referer is sent.

There are three ways to set the Referrer Policy:

  • Via the CSP setting
  • By adding a meta tag to the page head
  • By adding the referrerpolicy attribute to an a tag
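As an illustration, the three options might look like this (same-origin is used as the example policy; note that the original CSP referrer directive has since been superseded by the dedicated Referrer-Policy response header, so the header form is shown instead):

```html
<!-- 1. As an HTTP response header:
        Referrer-Policy: same-origin -->

<!-- 2. As a meta tag in the page head: -->
<meta name="referrer" content="same-origin">

<!-- 3. As a per-element attribute: -->
<a href="https://bbb.com/page" referrerpolicy="same-origin">link</a>
```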

That covers a lot of ground, but one problem remains: an attacker can hide the Referer in their own requests. If the attacker constructs the request as follows, the attack it launches will carry no Referer:

<img src="http://bank.example/withdraw?amount=10000&for=hacker" referrerpolicy="no-referrer">

CSRF Token

Another characteristic of CSRF mentioned above is that the attacker cannot directly steal the user’s information (cookies, headers, page content, and so on); they can only use what is in the cookies.

A CSRF attack succeeds because the server mistakes the attacker’s request for the user’s own. We can therefore require every user request to carry a token that a CSRF attacker cannot obtain. By verifying whether a request carries the correct token, the server can distinguish normal requests from attack requests and defend against CSRF.

There are two forms of token verification (the second is the more common):

  • Add a randomly generated token as a parameter of the HTTP request, and set up an interceptor on the server side to verify it. If a request carries no token, or the token is incorrect, reject it as a possible CSRF attack.
  • Define and validate a custom attribute in the HTTP header: instead of putting the token in the request as a parameter, put it in a custom HTTP header. Using the XMLHttpRequest class, you can add a csrftoken header attribute to all requests of this class at once and put the token value in it. This removes the inconvenience of adding the token to each request individually in the previous approach. At the same time, addresses requested via XMLHttpRequest are not recorded in the browser’s address bar, and the token will not leak to other sites through the Referer.