Everything is connected, and we cannot get through a day without the network. For programmers the network is the foundation, so how do we consolidate it? We need to read a lot. To learn networking we should first get a clear understanding of the four protocol layers, then master the basics of packet capture and anti-capture strategies, and of course the commonly used encryption and decryption techniques. Without further ado, let us begin the network journey.

1. Network stratification

Why network layering?

Because of the instability of the network

In the broad sense, the network can be divided into seven layers (the OSI model), but here we study four:

1.1 Application Layer

1.2 HTTP

1.2.1 HTTP versions

1.2.1.1 HTTP/1

A. The connection cannot be reused

With HTTP/1.0, a new connection must be established for every request, which adds latency.

B. Head-of-line blocking (HOLB)
  • HTTP/1.0: The next request can only be sent after the previous response has returned. Request-response pairs occur sequentially, so if one response takes a long time, all subsequent requests are blocked.

  • HTTP/1.1: Introduces pipelining, which allows browsers to send multiple requests (to the same domain, over the same TCP connection) at once. Pipelining requires responses to be returned in order, so if the first request is time-consuming (such as processing a large image), the responses to later requests must wait for it. Pipelining therefore only partially solves HOLB.

1.2.1.2 HTTP/2

1.2.1.2.1 Binary Transmission

HTTP/2 splits request and response data into smaller frames, which are binary encoded.

1.2.1.2.2 Multiplexing

Pain points solved by multiplexing

Multiplexing removes the problem of the browser limiting the number of concurrent requests to the same domain name. It also makes full-speed transmission easier to reach, because every new TCP connection has to ramp up its transmission speed gradually (TCP slow start).

Multiplexing features
  1. All communication with a domain name is done over a single connection.
  2. A single connection can carry any number of bidirectional data streams.
  3. Each stream carries messages, and each message consists of one or more frames. Frames from different streams can be interleaved on the connection, because the receiver reassembles them according to the stream identifier in each frame header.
Advantages of multiplexing
  1. The same domain name occupies only one TCP connection. Multiple requests and responses are sent in parallel over that connection, eliminating the delay and memory consumption caused by multiple TCP connections.
  2. Multiple requests are sent in parallel and interleaved, without affecting each other.
  3. Multiple responses are sent in parallel and interleaved, without interfering with each other.
  4. In HTTP/2 as specified in RFC 7540, each stream can declare a dependency on another stream (a 31-bit stream identifier) and a weight between 1 and 256. With this priority information, clients and servers can apply different policies to different streams in order to send streams, messages, and frames in the best order.
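To illustrate how frames from several streams can share one connection, here is a minimal sketch in Python. It is not the real HTTP/2 framing format: the `(stream_id, chunk)` tuples, the function names, and the 4-byte frame size are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import zip_longest


def split_into_frames(stream_id, payload, frame_size):
    """Cut one message into (stream_id, chunk) frames."""
    return [(stream_id, payload[i:i + frame_size])
            for i in range(0, len(payload), frame_size)]


def interleave(*frame_lists):
    """Round-robin the frames of several streams onto one connection."""
    wire = []
    for group in zip_longest(*frame_lists):
        wire.extend(frame for frame in group if frame is not None)
    return wire


def reassemble(wire):
    """Group frames by stream id; order inside each stream is preserved."""
    messages = defaultdict(bytes)
    for stream_id, chunk in wire:
        messages[stream_id] += chunk
    return dict(messages)


frames_a = split_into_frames(1, b"GET /index.html", 4)
frames_b = split_into_frames(3, b"GET /style.css", 4)
wire = interleave(frames_a, frames_b)   # frames of stream 1 and 3 alternate
messages = reassemble(wire)             # each message is rebuilt intact
```

The receiver only needs the stream identifier to put each chunk back into the right message, which is why a slow response on one stream does not hold up the others at the HTTP level.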

1.2.1.2.3 Header compression

  • Headers that carry cookies can repeat hundreds to thousands of bytes on every request. HTTP/2 compresses headers to reduce this resource consumption and improve performance:
    1. Use “header tables” to track and store previously sent key-value pairs, instead of sending the same data in every request and response.
    2. The header table exists throughout the lifetime of the HTTP/2 connection and is gradually updated by both the client and the server.
    3. Each new header key-value pair is either appended to the end of the current table or replaces a previous value in the table.
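The header table idea can be sketched as follows. This is a toy model, not HPACK itself (which also has a static table, size limits, and Huffman coding); the `HeaderTable` class and its integer-index encoding are assumptions made for illustration.

```python
class HeaderTable:
    """Toy header table shared in lockstep by sender and receiver."""

    def __init__(self):
        self.entries = []  # previously seen (name, value) pairs

    def encode(self, headers):
        """Replace already-known pairs with their table index."""
        encoded = []
        for pair in headers:
            if pair in self.entries:
                encoded.append(self.entries.index(pair))  # small integer
            else:
                self.entries.append(pair)
                encoded.append(pair)                      # full literal
        return encoded

    def decode(self, encoded):
        """Rebuild the headers, learning new literals along the way."""
        headers = []
        for item in encoded:
            if isinstance(item, int):
                headers.append(self.entries[item])
            else:
                self.entries.append(item)
                headers.append(item)
        return headers


client, server = HeaderTable(), HeaderTable()
request_headers = [("host", "example.com"), ("cookie", "session=abc123")]

first = client.encode(request_headers)    # sent as full literals
second = client.encode(request_headers)   # sent as table indexes only
decoded_first = server.decode(first)
decoded_second = server.decode(second)
```

On the second request the long cookie value is replaced by a single index, which is where the saving comes from.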

1.2.1.2.4 Server push

The server pushes the resources the client will need in advance, which reduces latency. If the browser supports it, you can also use prefetch.

1.2.1.3 HTTP/3

Based on QUIC, which runs over UDP.

1.2.2 HTTP definition

  1. A network transport protocol at the application layer, the top layer of the TCP/IP protocol family.
  2. The Hypertext Transfer Protocol, together with HTML (Hypertext Markup Language), is used to request and transfer HTML content over the network.
  3. Hypertext, or extended text, refers to text such as HTML that can contain hyperlinks to other text.

1.2.3 HTTP headers

1.2.3.1 Function of headers

Metadata for HTTP messages

1.2.3.2 Header

1.2.3.3 Host

  • The target host
  • Note: it is not used for addressing on the network, but for locating the right virtual host (sub-server) on the target server

1.2.3.3.1 Content-Length

Specify the length of the Body in bytes.

1.2.3.3.2 Content-Type

1.2.3.3.3 content-type concept

Specify the type of Body

1.2.3.3.4 content-type classification
  1. text/html

The type returned in the response to a request for a web page; the Body contains HTML text.

  2. application/x-www-form-urlencoded
  3. multipart/form-data
  4. application/json, image/jpeg, application/zip
  5. Accept

The types of data the client can accept, such as text/html.

  6. Accept-Charset

The character sets the client accepts, such as utf-8.

  7. Accept-Encoding

The compression encodings the client accepts, such as gzip.

  8. Content-Encoding

The compression type of the Body, such as gzip.
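To see how these headers fit together in one message, here is a small sketch that assembles a raw HTTP/1.1 request by hand. The host `api.example.com`, the `/login` path, and the `build_request` helper are hypothetical names chosen for illustration; a real client library would do this for you.

```python
import json


def build_request(host, path, payload):
    """Assemble a raw HTTP/1.1 POST request with a JSON body."""
    body = json.dumps(payload).encode("utf-8")
    lines = [
        f"POST {path} HTTP/1.1",
        f"Host: {host}",                     # which site on the server
        "Content-Type: application/json",    # type of the Body
        f"Content-Length: {len(body)}",      # Body length in bytes
        "Accept: application/json",          # what the client can accept
        "Accept-Encoding: gzip",             # compression the client accepts
    ]
    head = "\r\n".join(lines) + "\r\n\r\n"   # blank line ends the headers
    return head.encode("ascii") + body


raw_request = build_request("api.example.com", "/login", {"user": "alice"})
```

Note that Content-Length counts the bytes of the encoded Body, not the number of characters in the original text.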

1.2.4 HTTP status codes

  • 1xx

Informational. For example, 100 (Continue), 101 (Switching Protocols)

  • 2xx

Success. The most typical are 200 (OK) and 201 (Created)

  • 3xx

Redirection. For example, 301 (Moved Permanently), 302 (Found, temporarily moved), 304 (Not Modified)

  • 4xx

Client error. For example, 400 (Bad Request), 401 (Unauthorized, authentication failed), 403 (Forbidden), 404 (Not Found)

  • 5xx

Server error. For example, 500 (Internal Server Error)
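The five classes above follow directly from the first digit of the code. A minimal sketch using Python's standard `http.HTTPStatus` enum; the `status_class` and `describe` helpers are illustrative names, not part of any library.

```python
from http import HTTPStatus

STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def status_class(code):
    """Map a status code to its class using the first digit."""
    return STATUS_CLASSES.get(code // 100, "unknown")


def describe(code):
    """Combine the standard reason phrase with the class name."""
    return f"{code} {HTTPStatus(code).phrase}: {status_class(code)}"
```

For example, `describe(404)` returns `"404 Not Found: client error"`.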

1.2.5 HTTP proxy

An intermediary (transfer station) on the HTTP transmission path. It can be used for caching, acceleration, and load balancing.

1.2.6 HTTP Cache-Control

1.2.6.1 Role of Cache-Control

Cache data at clients or intermediate network nodes to reduce the frequency of fetching data from servers and improve network performance.

1.2.6.2 Properties of the Cache-Control field

  • max-age: how long, in seconds, the response may be served from the cache
  • no-store: caching is not allowed
  • no-cache: the response can be cached, but it must be revalidated with the server before each use
  • must-revalidate: if the cached copy has not expired it can continue to be used, but once it has expired it must be verified with the server before use

1.2.6.3 Verifying whether a cached resource has expired with conditional requests

  • Common request headers
    • If-Modified-Since
    • If-None-Match
  • When a 304 response is received, the resource in the cache can be reused

1.2.6.4 Verifying whether a resource has been modified

  • Common response headers
    • Last-Modified
    • ETag
  • The server must set these in the response message in advance so that they can be used with conditional requests
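The interplay between ETag and If-None-Match can be sketched as a tiny server-side handler. This is an assumption-laden illustration: deriving the ETag from a SHA-256 hash of the body, the `max-age=60` value, and the `handle_get` function are choices made here, not something HTTP mandates.

```python
import hashlib


def make_etag(body):
    """Derive a validator from the content (one possible scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def handle_get(resource_body, request_headers):
    """Return (status, headers, body) for a possibly conditional GET."""
    etag = make_etag(resource_body)
    if request_headers.get("If-None-Match") == etag:
        # The client's cached copy is still current: send no body.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag, "Cache-Control": "max-age=60"}, resource_body


# First request: full response, the client caches body and ETag.
status, headers, body = handle_get(b"<h1>v1</h1>", {})
# Revalidation: the client sends the ETag back in If-None-Match.
revalidated = handle_get(b"<h1>v1</h1>", {"If-None-Match": headers["ETag"]})
# After the resource changes, the same ETag no longer matches.
changed = handle_get(b"<h1>v2</h1>", {"If-None-Match": headers["ETag"]})
```

Last-Modified and If-Modified-Since work the same way, with a timestamp taking the place of the hash.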

1.2.6.5 How to refresh data in a browser

  • The browser can send the Cache-Control field in the request
    • Use “max-age=0” or “no-cache”

1.2.7 How to transfer large files over HTTP

1.2.7.1 Data Compression

  • Accept-Encoding

    • type
      • gzip

      • deflate
      • br
    • disadvantages
      • Generally, only text files compress well. Images, audio, video and other multimedia data are already highly compressed, so gzip will not make them smaller (and may even make them slightly larger).
  • Content-Encoding
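The difference between compressible and incompressible data is easy to observe with Python's `gzip` module. In this sketch, random bytes are used as a stand-in for already-compressed media such as JPEG or video; that substitution is an assumption made to keep the example self-contained.

```python
import gzip
import os

# Repetitive text compresses very well.
text = b"the quick brown fox jumps over the lazy dog\n" * 200
# Random bytes behave like already-compressed media: no redundancy left.
noise = os.urandom(len(text))

text_ratio = len(gzip.compress(text)) / len(text)
noise_ratio = len(gzip.compress(noise)) / len(noise)

print(f"text:  {text_ratio:.2%} of original size")
print(f"noise: {noise_ratio:.2%} of original size")
```

A ratio above 100% for the random data reflects the small fixed overhead gzip adds when it cannot find anything to compress.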

1.2.7.2 Chunked Transmission

  • type
    • Transfer-Encoding: chunked
  • principle
    • If a large file cannot be made smaller as a whole, “take it apart”: break it into smaller chunks that are sent to the browser in batches and reassembled there. In this way, browsers and servers do not have to hold the entire file in memory and only send and receive a small part of it at a time. The network is not occupied by one large file for a long time, and resources such as memory and bandwidth are saved.
  • Matters needing attention
    • “Transfer-Encoding: chunked” and “Content-Length” are mutually exclusive, that is, the two fields cannot appear at the same time in a response message: the transmission length of a response is either known (Content-Length) or unknown (chunked).
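On the wire, each chunk is preceded by its size in hexadecimal, and a zero-length chunk marks the end. A minimal encoder and decoder sketch follows; it ignores chunk extensions and trailers, and the function names are illustrative.

```python
def encode_chunked(body, chunk_size):
    """Encode a body using HTTP/1.1 chunked transfer encoding."""
    out = bytearray()
    for i in range(0, len(body), chunk_size):
        chunk = body[i:i + chunk_size]
        # Each chunk: hex length, CRLF, data, CRLF.
        out += f"{len(chunk):X}\r\n".encode("ascii") + chunk + b"\r\n"
    out += b"0\r\n\r\n"  # a zero-length chunk terminates the body
    return bytes(out)


def decode_chunked(data):
    """Reassemble the body from chunked data (no extensions or trailers)."""
    body = bytearray()
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        size = int(data[pos:line_end], 16)
        if size == 0:
            return bytes(body)
        start = line_end + 2
        body += data[start:start + size]
        pos = start + size + 2  # skip the CRLF after the chunk data


encoded = encode_chunked(b"hello world", 5)
decoded = decode_chunked(encoded)
```

Because the receiver learns the length chunk by chunk, the sender never needs to know the total size up front, which is why Content-Length is not used together with chunked encoding.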

1.2.7.3 Multi-segment Data

  • The MIME type
    • “multipart/byteranges”
  • It indicates that the body of the message is composed of multiple byte-range segments; a parameter “boundary=xxx” gives the separator between segments.

1.2.8 Login Authorization

1.2.8.1 cookies

1.2.8.1.1 Cookie working mechanism
  1. When the server wants the client to save some content, it puts it in Set-Cookie headers in the response, and the client saves it automatically.
  2. Cookies saved by the client are carried in the Cookie header of all subsequent requests and sent back to the server.
  3. Cookies saved by the client are grouped by server domain name. For example, cookies saved for shop.com will not be carried in subsequent requests to games.com.
  4. Cookies saved by the client are deleted after they time out. Cookies without a timeout (called session cookies) are deleted automatically when the browser closes. In addition, the server can actively delete client cookies that have not expired.
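The save-and-send-back cycle, including the per-domain separation, can be sketched with Python's standard `http.cookies` module. The `TinyCookieJar` class is a hypothetical helper written for this illustration; it ignores expiry, paths, and subdomain matching.

```python
from http.cookies import SimpleCookie


class TinyCookieJar:
    """Minimal client-side cookie store, keyed by server domain."""

    def __init__(self):
        self.by_domain = {}

    def save_from_response(self, domain, set_cookie_value):
        """Store the cookies found in one Set-Cookie header value."""
        parsed = SimpleCookie()
        parsed.load(set_cookie_value)
        store = self.by_domain.setdefault(domain, {})
        for name, morsel in parsed.items():
            store[name] = morsel.value

    def cookie_header(self, domain):
        """Build the Cookie header value for a request to this domain."""
        store = self.by_domain.get(domain, {})
        return "; ".join(f"{name}={value}" for name, value in store.items())


jar = TinyCookieJar()
jar.save_from_response("shop.com", "session=abc123; Path=/; HttpOnly")

shop_header = jar.cookie_header("shop.com")    # cookie is sent back
games_header = jar.cookie_header("games.com")  # nothing saved for this domain
```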

1.2.8.1.2 How do I implement a Cookie Manager?

1.2.8.1.3 Cookie Persistent Storage

If cookies exist only in memory, they all disappear when the app is closed. What we expect is that the next time we open the app it still logs in automatically and goes to the main page, which requires persisting cookies at the file level. However, OkHttp's Cookie class has a nasty twist: it does not implement Serializable, so we have to implement serialization ourselves.

1.2.8.1.4 Cookie support and call mechanism of OkHttp

By default, when no CookieJar is set, OkHttp does not handle cookies. So we need to provide a custom CookieJar; OkHttp takes the cookies from the CookieJar and loads them into each request.

When the response is received, the server’s cookies are stored locally through saveFromResponse, which is called in HttpEngine.

The important part is the validation and storage mechanism for cookies. First we set up a dedicated class responsible for managing cookies. This class is not the CookieJar implementation itself: we create a separate CookieManager with two levels of storage, memory storage and file storage. When the app is closed, the memory storage is cleared, while the cookies in use remain stored in the local file. When the app starts, it loads the cookies from the file.
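The two-level storage idea is independent of OkHttp, so it can be sketched in a few lines of Python. This is only a model of the mechanism, not the Android implementation: the `CookieManager` class here, its JSON file format, and the temporary file path are all assumptions made for illustration.

```python
import json
import os
import tempfile


class CookieManager:
    """Two-level cookie store: an in-memory dict backed by a JSON file."""

    def __init__(self, path):
        self.path = path
        self.memory = {}
        # On start-up, reload whatever the previous run persisted.
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as fh:
                self.memory = json.load(fh)

    def save(self, domain, name, value):
        """Update the memory level, then write through to the file."""
        self.memory.setdefault(domain, {})[name] = value
        with open(self.path, "w", encoding="utf-8") as fh:
            json.dump(self.memory, fh)

    def load(self, domain):
        """Return a copy of the cookies stored for a domain."""
        return dict(self.memory.get(domain, {}))


path = os.path.join(tempfile.mkdtemp(), "cookies.json")

first_run = CookieManager(path)
first_run.save("shop.com", "session", "abc123")

# Simulate closing and reopening the app: memory is gone, the file is not.
second_run = CookieManager(path)
restored = second_run.load("shop.com")
```

Storing plain name and value pairs sidesteps the serialization problem in this sketch; a real implementation would also need to persist expiry, path, and the other cookie attributes.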

1.2.8.1.5 Sharing a Session between WebView and OkHttp

The server identifies the connected client through the Session, so we need a mechanism for WebView and OkHttp to share cookies; this lets the H5 page stay logged in in sync with the native app. Reading the source code shows that OkHttp sets cookies through the CookieJar. In the OkHttpClient Builder, the default CookieJar is an empty object that sets no cookies.

Therefore, we only need to obtain the WebView’s cookies manually and set them on the OkHttp CookieJar to achieve cookie sharing. The engine of WebView is WebKit, and in WebKit the CookieManager manages cookies.

Create the custom CookieJar in the OkHttpClient utility class.

1.2.8.1.6 Cookie function
Session management: Login status, shopping cart

Personalization: user preferences, themes

Tracking: Analyzing user behavior

1.2.8.1.7 How do I ensure Cookie security
  • XSS (Cross-site Scripting): an attacker uses JavaScript to read the browser’s cookies and send them to the attacker’s own website, stealing the user’s cookies in this way.
    • Solution: when the server sends a cookie, it marks sensitive cookies as HttpOnly.
    • HttpOnly: the cookie can only be used in HTTP requests and cannot be read by JavaScript. This prevents script code from abusing cookies.
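On the server side, HttpOnly is just a flag on the Set-Cookie header. A minimal sketch with Python's standard `http.cookies` module; the cookie name and value are placeholders, and adding the Secure flag as well is a suggestion of this example rather than something stated above.

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["session"] = "abc123"
cookie["session"]["httponly"] = True   # hide the cookie from JavaScript
cookie["session"]["secure"] = True     # assumption: also restrict it to HTTPS
cookie["session"]["path"] = "/"

# The value to place after "Set-Cookie: " in the response.
set_cookie_value = cookie["session"].OutputString()
print(set_cookie_value)
```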

1.2.8.2 Authorization

1.2.8.3 Authorization implementation modes
Mode 1: Basic
  • Format: Authorization: Basic <Base64 of username:password>
Mode 2: Bearer
  • Format: Authorization: Bearer <token>
  • The token can be obtained through the OAuth2 authorization process
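Both header formats are simple to build. A minimal sketch follows; the function names and the sample credentials are illustrative.

```python
import base64


def basic_auth_header(username, password):
    """Value of the Authorization header for HTTP Basic authentication."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def bearer_auth_header(access_token):
    """Value of the Authorization header for a Bearer token."""
    return f"Bearer {access_token}"


basic_value = basic_auth_header("alice", "secret")
bearer_value = bearer_auth_header("example-token")
```

Base64 is an encoding, not encryption: anyone who sees a Basic header can recover the password, so it should only be sent over HTTPS.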
1.2.8.4 Authorization process

  1. The third-party app applies to Tencent for third-party authorization cooperation and obtains a client_id and client_secret
  2. When the user taps “Login with WeChat” in the third-party app, the app uses the WeChat SDK to jump to WeChat and passes in its client_id as its identifier
  3. WeChat interacts with its server to fetch the third-party app’s information and shows it on screen, then asks the user whether they agree to authorize the app to log in with WeChat
  4. After the user confirms “Login with WeChat”, WeChat interacts with its server to submit the authorization, then jumps back to the third-party app and passes in an authorization_code as the credential of the user’s approval
  5. The third-party app calls the “WeChat login” API of its own server, passes in the authorization_code, and waits for the server’s response
  6. After receiving the login request, the third-party server sends the authorization_code together with its client_secret to WeChat’s third-party authorization interface. After verifying them, WeChat returns the access_token
  7. After receiving the access_token, the third-party server immediately sends a request with the access_token to WeChat’s user information interface. After verifying it, WeChat returns the user information
  8. After receiving the user information, the third-party server creates an account for the user in its own database, fills it in with the user information from the WeChat server, and associates the user ID with the user’s WeChat ID
  9. After the user is created, the server responds to the client’s request, sending back the newly created user information
  10. The client receives the response from the server, and the user is logged in.

Using a Bearer token in your own app: some apps design their own login and authorization API like OAuth2 but drop the authorization_code step. That is, a successful login request returns an access_token, and the client then uses this access_token as a Bearer token in subsequent requests.

Why does OAuth introduce the authorization code, requiring the third party to send the authorization code back to its own server and obtain the access token from there, instead of returning the access token directly? What is the point of such a complicated process?

For security. OAuth does not force the authorization process to use HTTPS, so it must remain secure enough even when there is an eavesdropper on the communication path. An eavesdropper who captures the authorization code cannot exchange it for an access token without the client_secret, which never leaves the third-party server.

1.2.8.5 How Do I Refresh a Token

Usage

The access_token has an expiration time. After it expires, the client calls the refresh token interface and passes in the refresh_token to obtain a new access_token.

Purpose

Security. When an access_token is stolen, the attacker has only a short time to misuse it, because it expires. At the same time, because (in the standard OAuth2 process) the refresh token exists only on the third-party service’s server, there is little risk of the refresh token being stolen.
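The expiry-and-refresh cycle can be modelled with a small in-memory token service. This is a sketch under stated assumptions, not an OAuth2 implementation: the `TokenService` class, the 60-second lifetime, and passing the current time in as `now` (to keep the example deterministic) are all choices made for illustration.

```python
import secrets


class TokenService:
    """Toy authorization server that issues short-lived access tokens."""

    def __init__(self, lifetime_seconds):
        self.lifetime = lifetime_seconds
        self.access_expiry = {}      # access_token -> expiry time
        self.refresh_tokens = set()  # long-lived, kept server side

    def _issue_access_token(self, now):
        token = secrets.token_hex(16)
        self.access_expiry[token] = now + self.lifetime
        return token

    def login(self, now):
        """Return a fresh (access_token, refresh_token) pair."""
        refresh_token = secrets.token_hex(16)
        self.refresh_tokens.add(refresh_token)
        return self._issue_access_token(now), refresh_token

    def is_valid(self, access_token, now):
        return self.access_expiry.get(access_token, 0) > now

    def refresh(self, refresh_token, now):
        """Exchange a known refresh_token for a new access_token."""
        if refresh_token not in self.refresh_tokens:
            raise PermissionError("unknown refresh token")
        return self._issue_access_token(now)


service = TokenService(lifetime_seconds=60)
access, refresh = service.login(now=0)

valid_early = service.is_valid(access, now=30)   # still inside the lifetime
valid_late = service.is_valid(access, now=61)    # expired
renewed = service.refresh(refresh, now=61)       # obtain a new access token
```

A stolen access token stops working once its lifetime has passed, while the refresh token is never sent along with ordinary API requests.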

Implementing third-party login authorization with LINE

Build the authorization entity class

Provide the authorization callback

Construct the LineSDK callback result data

1.3 HTTPS

HTTPS definition

  1. HTTP over SSL (or TLS): HTTP runs on top of SSL/TLS. In short, HTTP with encrypted communication.
  2. HTTPS changes the layer directly beneath HTTP from TCP/IP to SSL/TLS, which itself runs over TCP

HTTPS background

  1. Websites that support HTTPS generally use certificates issued by CA organizations, which are trusted by default.
  2. A self-signed certificate is generated with a tool such as keytool and used without being issued by a CA. Websites that use self-signed certificates generally trigger risk warnings when visited in a browser

The HTTPS certificate

A digital certificate is the basis for secure transmission through HTTPS. It is issued by an authoritative CA. The main contents of a certificate include public key, certificate authority, certificate holder, certificate validity period, signature algorithm, fingerprint and fingerprint algorithm

HTTPS Certificate Content

In a certificate you can see that the public key is a very long 2048-bit string. You can also see that the holder information lists some websites, followed by the issuer, the expiration date, the signature algorithm, and of course the fingerprint and fingerprint algorithm

Fingerprint and signature of the certificate

The fingerprint is a hash of the certificate content. The signature is a character string appended to the end of the certificate to prove that the information has not been modified: the CA encrypts the fingerprint (computed with the fingerprint algorithm) using its private key
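The relationship between fingerprint and signature can be shown with a toy example. This sketch uses textbook RSA with tiny primes (p = 61, q = 53) so that it runs with the standard library alone; it is for illustration only and offers no security. The key values, the fake certificate bytes, and the function names are all assumptions.

```python
import hashlib

# Toy CA key pair: n = 61 * 53, public exponent E, private exponent D.
N, E, D = 3233, 17, 2753


def fingerprint(cert_bytes):
    """The fingerprint is a hash of the certificate content."""
    return hashlib.sha256(cert_bytes).hexdigest()


def digest_as_int(cert_bytes):
    """Reduce the hash to a number the toy key can handle."""
    return int.from_bytes(hashlib.sha256(cert_bytes).digest(), "big") % N


def ca_sign(cert_bytes):
    """CA side: encrypt the digest with the CA private key."""
    return pow(digest_as_int(cert_bytes), D, N)


def client_verify(cert_bytes, signature):
    """Client side: decrypt with the CA public key and compare digests."""
    return pow(signature, E, N) == digest_as_int(cert_bytes)


certificate = b"subject=example.com;public_key=...;issuer=Toy CA"
signature = ca_sign(certificate)
is_trusted = client_verify(certificate, signature)
```

If any byte of the certificate changes, the recomputed digest almost always stops matching the decrypted signature; with a modulus this small, accidental matches are possible, which is one reason real certificates use keys of 2048 bits or more.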

Scenarios where you need to write your own certificate verification process
Fingerprint and signature of HTTPS certificate
HTTPS certificate verification request process

HTTPS Certificate Type
  • CA
  • X.509
HTTPS Certificate Function
  • Authenticate clients and servers to ensure that data is sent to the correct client and server (certificate validation)
  • Encrypt data to prevent it from being stolen along the way (encryption)
  • Maintain data integrity and ensure that data is not changed during transmission
How do I customize the HTTPS certificate

HTTPS SSL/TLS

The concept of SSL/TLS

SSL/TLS is an authoritative standard in the field of information security, using a variety of advanced encryption technologies to ensure communication security

SSL/TLS
  • TLS 1.3

Improvements of TLS 1.3 over TLS 1.2

  • Compression in the record protocol is explicitly disallowed
  • RC4 and DES symmetric encryption algorithms are abolished; only AES and ChaCha20 are retained
  • ECB, CBC and other traditional block modes are abolished; only the AEAD modes GCM, CCM and Poly1305 can be used
  • MD5, SHA1 and SHA-224 digest algorithms are abolished; only SHA256 and SHA384 can be used
  • The RSA and DH key exchange algorithms are abolished; only ECDHE and DHE remain
  • Many named curves are abolished; elliptic curves have been “chopped” to only 5, including P-256 and X25519

HTTPS Authentication

HTTPS unidirectional authentication

  • (1) The client initiates an HTTPS request and sends its SSL version information to the server.
  • (2) The server has applied to a CA organization for a certificate in advance. As mentioned above, the certificate contains the server’s public key and the CA’s signature. The server sends this certificate to the client
  • (3) The client reads the plaintext information of the certificate and uses the same hash function to calculate a hash digest (the purpose of the hash is to detect modification of the content). It then decrypts the signature with the CA’s public key (because the signature was encrypted with the CA’s private key) and compares the result with the digest it calculated. If they match, the certificate is trusted and the server’s public key is extracted
  • (4) The client generates a random number (key F), encrypts it with the server’s public key (B_public), and sends it to the server.
  • (5) The server decrypts the ciphertext with its own private key (B_private) and obtains key F
  • (6) The server and client use key F for all subsequent communication. From this point on, the communication uses symmetric rather than asymmetric encryption
HTTPS bidirectional authentication

  • (1) The client initiates an HTTPS request and sends its SSL version information to the server.
  • (2) The server has applied to a CA organization for a certificate in advance. As mentioned above, the certificate contains the server’s public key and the CA’s signature. The server sends this certificate to the client
  • (3) The client reads the plaintext information of the certificate and uses the same hash function to calculate a hash digest (the purpose of the hash is to detect modification of the content). It then decrypts the signature with the CA’s public key (because the signature was encrypted with the CA’s private key) and compares the result with the digest it calculated. If they match, the certificate is trusted and the server’s public key is extracted
  • (4) The client sends its own client certificate to the server. The certificate contains the client’s public key (C_public)
  • (5) The client sends the symmetric encryption schemes it supports to the server for selection
  • (6) After choosing an encryption scheme, the server encrypts its choice with the C_public key it has just obtained and sends it to the client
  • (7) The client decrypts the chosen encryption scheme with its own private key (C_private). It then generates a random number (key F), encrypts it with the server’s public key (B_public), and sends the ciphertext to the server.
  • (8) The server and client use key F for all subsequent communication. From this point on, the communication uses symmetric rather than asymmetric encryption
HTTPS Connection Establishment Process (Working Principle)

The client and server negotiate a symmetric key, which is used to encrypt each message before sending and decrypt it after receiving

  1. Client Hello
  2. Server Hello
  3. Trust in the server certificate is established
  4. Pre-master Secret
  5. Client notification: encrypted communication will be used
  6. Client sends Finished
  7. Server notification: encrypted communication will be used
  8. Server sends Finished
HTTPS optimization

  • Protocol optimization

Use TLS 1.3 where possible. It greatly simplifies the handshake, so a complete handshake takes only 1-RTT, and it is more secure

  • Certificate optimization
    • You can choose an elliptic curve (ECDSA) certificate instead of an RSA certificate. Because 224-bit ECC is equivalent in strength to 2048-bit RSA, an elliptic curve certificate is much smaller than an RSA one
    • Enable OCSP Stapling on the server so that clients do not have to contact the CA to check the certificate’s status
    • Session resumption works like a cache. Once the client has successfully established a connection, it can later use credentials such as a Session ID or Session Ticket to skip key exchange and certificate authentication and start encrypted communication directly
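As an illustration of the protocol optimization, a client can be configured to refuse anything older than TLS 1.3. The sketch below uses Python's standard `ssl` module and makes no network connection; requiring TLS 1.3 outright is an assumption of this example, since many deployments still need to allow TLS 1.2.

```python
import ssl

# Default client context: verifies the server certificate and host name.
context = ssl.create_default_context()

# Require TLS 1.3 when the local OpenSSL build supports it.
if ssl.HAS_TLSv1_3:
    context.minimum_version = ssl.TLSVersion.TLSv1_3

print(context.minimum_version, context.verify_mode, context.check_hostname)
```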
HTTPS ensures communication security
  • Confidentiality
    • The “secrecy” of data: it can only be accessed by trusted parties and is not visible to anyone else. Put simply, people who are not involved should not be able to see it.
  • Integrity
    • The data is not tampered with during transmission; nothing is added or removed, and it remains intact.
  • Identity authentication
    • Verify the real identity of the other party and ensure that messages are sent only to trusted parties
  • Non-repudiation
    • Neither party can deny what it has done, which ensures the authenticity of transactions

1.3 Transport Layer

TCP

  • Four-way wave (connection teardown)

Purpose of the wave:

Four packets are sent to tear down a TCP connection

What is a TCP connection?

The two parties confirm that they can communicate with each other and will not discard each other’s messages. That is, the two parties establish a connection.

Waving process:

  • Host A sends a segment (sequence number: seq = P, flag: FIN = 1) and enters the FIN-WAIT-1 state. Sending FIN = 1 means “I want to close the connection”: host A will transmit no more data, but it can still receive data from host B.

  • After receiving the FIN segment from host A, host B sends an acknowledgement segment with ACK = 1 and acknowledgement number = P+1, then enters the CLOSE-WAIT state. This shows that host B has received the close request from the closing party.

  • When host B has sent all of its data, it sends a segment to host A to release the connection: FIN = 1, ACK = 1, sequence number seq = Q, acknowledgement number = P+1. Host B then enters the LAST-ACK state.

  • After receiving this FIN segment, host A sends an acknowledgement segment with ACK = 1, acknowledgement number = Q+1, and sequence number seq = P+1, then enters the TIME-WAIT state. At this point the TCP connection is not yet released: host A waits for 2∗MSL (MSL is the maximum segment lifetime) before entering the CLOSED state.
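The four segments and their sequence and acknowledgement numbers can be written down as data. This is only a bookkeeping sketch, not a TCP implementation; it assumes host B sends no further data between its ACK and its FIN, so both carry the same sequence number, and the `four_way_close` function is an illustrative name.

```python
def four_way_close(p, q):
    """Segments sent when host A closes first.

    p is host A's current sequence number, q is host B's.
    "state" is the state the sender enters after sending the segment.
    """
    return [
        {"from": "A", "flags": {"FIN"}, "seq": p, "ack": None,
         "state": "FIN-WAIT-1"},
        {"from": "B", "flags": {"ACK"}, "seq": q, "ack": p + 1,
         "state": "CLOSE-WAIT"},
        {"from": "B", "flags": {"FIN", "ACK"}, "seq": q, "ack": p + 1,
         "state": "LAST-ACK"},
        {"from": "A", "flags": {"ACK"}, "seq": p + 1, "ack": q + 1,
         "state": "TIME-WAIT"},
    ]


segments = four_way_close(p=1000, q=5000)
for segment in segments:
    print(segment)
```

Each FIN consumes one sequence number, which is why the acknowledgement for a FIN with seq = P carries P+1.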

Three-way handshake

Purpose of the handshake

Connect to a specified port on the server, establish a TCP connection, synchronize the sequence numbers and acknowledgement numbers of both parties, and exchange TCP window size information

Handshake process

  • The client sends a packet with SYN = 1 that indicates the server port it intends to connect to, with its initial sequence number (ISN) seq = X in the Sequence Number field of the header.

  • The server sends back an acknowledgement with SYN = 1 and ACK = 1. The Acknowledgement Number is set to the client’s ISN plus one (X+1), and the Sequence Number carries the server’s own ISN (seq = Y).

  • The client sends an ACK again, with SYN = 0 and ACK = 1. It puts the server’s sequence number plus one (Y+1) in the Acknowledgement Number field and its own ISN plus one (X+1) in the Sequence Number field.
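The numbers exchanged in the handshake can be checked with a few lines of code. This is a bookkeeping sketch rather than a TCP implementation; the `three_way_handshake` function and the sample initial sequence numbers are illustrative.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the three handshake segments as dictionaries."""
    syn = {"from": "client", "SYN": 1, "ACK": 0,
           "seq": client_isn, "ack": None}
    syn_ack = {"from": "server", "SYN": 1, "ACK": 1,
               "seq": server_isn, "ack": syn["seq"] + 1}
    ack = {"from": "client", "SYN": 0, "ACK": 1,
           "seq": client_isn + 1, "ack": syn_ack["seq"] + 1}
    return [syn, syn_ack, ack]


syn, syn_ack, ack = three_way_handshake(client_isn=100, server_isn=300)
```

With these inputs the server acknowledges 101 and the client acknowledges 301, so each side has confirmed the other's initial sequence number.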

Why use long connections

A mobile device is not directly on the Internet but on the carrier’s intranet, and does not have a real public IP address. Therefore, when a TCP connection has carried no traffic for a period of time, the carrier’s gateway closes the channel between that TCP connection and the public network for the sake of network performance. The TCP port can then no longer receive messages from outside, which means the TCP connection is effectively closed.

How long connections are implemented

The heartbeat. That is, very short meaningless messages are sent over the TCP connection at a fixed interval so that the gateway cannot classify it as an “idle connection”, which prevents the gateway from closing it.
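The timing logic behind a heartbeat is small enough to sketch. The `HeartbeatTimer` class, the 30-second interval, and passing the clock in as `now` are assumptions made for illustration; a real client would pick the interval to stay below the gateway's idle timeout and would send an actual message on the socket.

```python
class HeartbeatTimer:
    """Decides when a keep-alive message should be sent."""

    def __init__(self, interval_seconds):
        self.interval = interval_seconds
        self.last_activity = 0.0

    def record_activity(self, now):
        """Call whenever any data is sent or received on the connection."""
        self.last_activity = now

    def heartbeat_due(self, now):
        """True once the connection has been idle for a full interval."""
        return now - self.last_activity >= self.interval


timer = HeartbeatTimer(interval_seconds=30)
timer.record_activity(now=100)

due_early = timer.heartbeat_due(now=120)  # only 20 s idle
due_late = timer.heartbeat_due(now=130)   # 30 s idle: send a heartbeat
timer.record_activity(now=130)            # the heartbeat itself counts
```

Resetting the timer on every real message means heartbeats are only sent when the connection would otherwise look idle.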

TCP Packet Format

UDP

1.4 Internet Layer

  • IP

  • IPv4
  • IPv6

1.5 Link Layer

  • Wi-Fi
  • Ethernet