Why you need HTTPS

In short: the HTTP protocol is not secure. Because HTTP transmits everything in plain text, a man in the middle can easily intercept, modify, and forge requests on the link. A secure communication protocol should provide four properties:

  1. Confidentiality: Data can only be read by trusted parties
  2. Integrity: Data cannot be tampered with during transmission
  3. Identity authentication: The identities of the communicating parties are verified
  4. Non-repudiation: A party cannot later deny what it sent or did during the communication

To meet these requirements, HTTPS was created.

What is HTTPS

Hypertext Transfer Protocol Secure (HTTPS) is a protocol for secure communication over computer networks. Its default port is 443. As the title of RFC 2818, “HTTP Over TLS,” suggests, it simply inserts a TLS layer between TCP/IP and HTTP:

Therefore, it can be said that HTTPS security is guaranteed by TLS.

SSL/TLS

SSL, or Secure Sockets Layer, sits at layer 5 (the session layer) of the OSI model. In 1999 the IETF (Internet Engineering Task Force) renamed it TLS (Transport Layer Security), so TLS1.0 is effectively SSLv3.1. The latest version is TLS1.3, published in 2018; TLS1.2 is still widely used.

When a browser and a server establish a TLS connection, they must agree on a set of algorithms for secure communication. This combination of algorithms is called a cipher suite. Its basic format is key exchange algorithm + signature algorithm + symmetric encryption algorithm + digest algorithm. For example, ECDHE-RSA-AES256-GCM-SHA384 means that ECDHE is used for key exchange during the handshake, RSA is used for signatures and identity authentication, AES with a 256-bit key in GCM block mode is used for symmetric encryption after the handshake, and SHA384 is the digest algorithm used for message authentication and for generating pseudo-random data.
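To make the naming format concrete, here is a minimal sketch that splits a suite name into its parts. The `CipherSuite` type and `parse_cipher_suite` function are hypothetical names made up for this illustration, and the sketch assumes the five-field, dash-separated spelling used in the example above; real suite names do not all follow that shape.

```python
from typing import NamedTuple


class CipherSuite(NamedTuple):
    key_exchange: str
    signature: str
    cipher: str
    mode: str
    digest: str


def parse_cipher_suite(name: str) -> CipherSuite:
    """Split a name such as ECDHE-RSA-AES256-GCM-SHA384 into its fields."""
    # Assumption: exactly five dash-separated fields; anything else raises ValueError.
    key_exchange, signature, cipher, mode, digest = name.upper().split("-")
    return CipherSuite(key_exchange, signature, cipher, mode, digest)


suite = parse_cipher_suite("ECDHE-RSA-AES256-GCM-SHA384")
print(suite)
```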

How it works

Symmetric encryption

Symmetric encryption, as the name suggests, uses the same key for encryption and decryption:

The commonly used algorithms are AES and ChaCha20. An AES key can be 128, 192, or 256 bits long; a ChaCha20 key is 256 bits long. Note that key lengths are given in bits, not bytes: a 128-bit key is a 16-byte binary string.
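The bits-versus-bytes point is easy to check. A tiny sketch using Python's standard `secrets` module (the variable names are just for illustration):

```python
import secrets

KEY_BITS = 128
key = secrets.token_bytes(KEY_BITS // 8)  # 128 bits -> 16 random bytes

print(len(key), "bytes =", len(key) * 8, "bits")
```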

Because a block cipher encrypts only a fixed-length block of plaintext with a key, symmetric encryption also needs a block mode that splits the plaintext into fixed-length blocks, so that one key can encrypt plaintext of any size. Common block modes include CBC (cipher block chaining mode), CTR (counter mode), and so on.

Symmetric encryption alone is clearly not enough: even though the plaintext is encrypted, the symmetric key itself still has to be sent over the network, and if it is intercepted the encryption is pointless.
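The idea of a block mode can be sketched with a toy counter-mode construction. Python's standard library has no AES, so this sketch uses SHA-256 as a stand-in for the block cipher; that substitution, the 32-byte block size, and the names `keystream_block` and `ctr_xor` are assumptions for illustration only. It shows the structure of counter mode, not a cipher to use in practice.

```python
import hashlib
import secrets

BLOCK_SIZE = 32  # bytes per keystream block (the SHA-256 output size)


def keystream_block(key: bytes, nonce: bytes, counter: int) -> bytes:
    # Stand-in for "encrypt the counter block with the key" in real CTR mode.
    return hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()


def ctr_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt by XORing each block with its own keystream block."""
    out = bytearray()
    for counter, start in enumerate(range(0, len(data), BLOCK_SIZE)):
        block = data[start:start + BLOCK_SIZE]
        stream = keystream_block(key, nonce, counter)
        out.extend(b ^ s for b, s in zip(block, stream))
    return bytes(out)


key = secrets.token_bytes(16)
nonce = secrets.token_bytes(8)
plaintext = b"a message longer than one block, so several blocks are needed"

ciphertext = ctr_xor(key, nonce, plaintext)
recovered = ctr_xor(key, nonce, ciphertext)  # the same operation decrypts
```

One key handles a message of any length because each block gets a different counter value, and applying the same function twice returns the original data.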

Asymmetric encryption

Asymmetric encryption, as the name suggests, uses different keys for encryption and decryption:

The public key is published, while the private key is kept secret. Data encrypted with the public key can only be decrypted with the corresponding private key. Conversely, data encrypted with the private key can only be decrypted with the corresponding public key.

Common asymmetric algorithms include RSA, ECC, and DH. RSA’s security is based on the “integer factorization problem”; keys are generally 1024 bits long, or 2048 bits where security matters. ECC’s security is based on the “elliptic curve discrete logarithm problem”; its sub-algorithms include ECDHE for key exchange and ECDSA for digital signatures. ECC has significant security and performance advantages over RSA: 160-bit ECC is equivalent to 1024-bit RSA. The DH algorithm is based on the discrete logarithm problem. It does not operate on a plaintext or a digest, so it can only be used for key exchange.
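To see why DH can agree on a key without ever transmitting it, here is a toy sketch of classic (non-elliptic-curve) Diffie-Hellman using only built-in integer arithmetic. The prime `P`, the generator `G`, and the function name `make_keypair` are assumptions chosen for illustration; the parameters are far too small and unvetted for real use.

```python
import secrets

# Public parameters known to everyone (toy values, assumed for illustration).
P = 2**127 - 1  # a Mersenne prime
G = 3


def make_keypair() -> tuple[int, int]:
    private = secrets.randbelow(P - 3) + 2  # secret exponent in [2, P - 2]
    public = pow(G, private, P)             # safe to send over the network
    return private, public


client_private, client_public = make_keypair()
server_private, server_public = make_keypair()

# Each side combines its own private value with the other side's public value.
client_secret = pow(server_public, client_private, P)
server_secret = pow(client_public, server_private, P)
```

Both sides end up with the same number even though only the public values crossed the network. ECDHE follows the same pattern with elliptic curve points instead of modular exponentiation.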

Hybrid encryption

Although asymmetric encryption seems perfect, it is computationally heavy and slow, so it is not suitable for encrypting the bulk of the plaintext. Therefore a “hybrid encryption” scheme is used: one party first generates a public/private key pair with an asymmetric algorithm and publishes the public key; the other party generates a random session key for a symmetric algorithm, encrypts it with that public key, and sends it over; the key-pair owner decrypts it with the private key to recover the session key; from then on both sides communicate using symmetric encryption under the session key. Because the session key is very short, encrypting it asymmetrically is cheap, which avoids the slowness of asymmetric encryption.

With hybrid encryption, the confidentiality of communication is ensured, but integrity is not, and there is still no identity authentication. The following extensions are therefore needed to achieve secure communication.

Digest algorithm

A digest algorithm, i.e. a hash function, is a one-way, irreversible algorithm. It maps data of arbitrary length to a fixed-length string that is unique for practical purposes, and it is very sensitive to changes in the input: any modification of the data changes the output. It can therefore be used to detect whether data has been modified.

Common digest algorithms include MD5 and SHA-1, but they are no longer strong enough and have been disabled in TLS. TLS currently recommends SHA-2, a family of six algorithms: SHA-224, SHA-256, SHA-384, SHA-512, SHA-512/224, and SHA-512/256. The trailing number is the length of the generated digest in bits.

Using these properties, the sender encrypts the plaintext together with its digest and sends both. The receiver decrypts them, computes the digest of the plaintext itself, and compares it with the digest from the sender. If the two match, the plaintext has not been modified, which ensures the integrity of the communication.
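Both properties, the fixed output length and the sensitivity to input changes, can be observed with Python's standard `hashlib` module. This is only a quick illustration; the sample messages are made up.

```python
import hashlib

message = b"hello https"

# The digest length depends only on the algorithm, never on the input size.
for name in ("sha224", "sha256", "sha384", "sha512"):
    digest = hashlib.new(name, message).digest()
    print(name, len(digest) * 8, "bits")

# Changing a single character produces an entirely different digest.
d1 = hashlib.sha256(b"hello https").hexdigest()
d2 = hashlib.sha256(b"hello httpS").hexdigest()
print(d1)
print(d2)
```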
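The compare-the-digests step can be sketched as follows. The encryption of the plaintext and digest is left out here, and the function names `attach_digest` and `is_intact` are hypothetical; the sketch only shows what the receiver does after decryption.

```python
import hashlib
import hmac


def attach_digest(plaintext: bytes) -> tuple[bytes, bytes]:
    """Sender side: pair the plaintext with its SHA-256 digest."""
    return plaintext, hashlib.sha256(plaintext).digest()


def is_intact(plaintext: bytes, received_digest: bytes) -> bool:
    """Receiver side: recompute the digest and compare it with the one received."""
    expected = hashlib.sha256(plaintext).digest()
    return hmac.compare_digest(expected, received_digest)


message, digest = attach_digest(b"transfer 100")
print(is_intact(message, digest))          # unmodified message
print(is_intact(b"transfer 900", digest))  # modified in transit
```

`hmac.compare_digest` is used for the comparison because it takes the same time whether or not the values match.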

Digital signature and certificate

In asymmetric encryption, the private key is held only by its owner, so it can serve as unique identity information. Before sending, the sender encrypts the digest of the plaintext with its private key; the result is the digital signature. The receiver decrypts the signature with the sender’s public key and compares the result with a digest computed from the received text. A match verifies both the integrity of the text and the identity of the sender. So as long as the two parties exchange public keys and “sign” and “verify” each other’s messages, identity authentication is achieved.

However, this approach has a weakness: anyone can publish a public key, so there is no way to confirm that a public key you obtained is genuine. A trusted “third party”, the Certificate Authority (CA), is therefore needed to vouch for the public key.

To apply for a certificate, the applicant puts together a request containing information such as who they are, where they are from, and what the certificate is for, along with their public key, signs it, and sends it to the CA. If the CA approves the request, it signs it with the CA’s private key. If a CA uses its own private key to sign a certificate for its own public key, the result is called a self-signed certificate.

The signed certificate contains basic information about the applicant and the CA, the validity period of the certificate, the applicant’s public key, the digest algorithm used for the signature, and the CA’s signature.

So how is a certificate verified? Take the browser as an example. After receiving the certificate, the browser checks which CA signed it, finds the corresponding CA certificate in its trusted CA store, uses the public key in the CA certificate to decrypt the signature on the website certificate, and computes the digest of the website certificate with the digest algorithm named in the certificate. If the two digests match, the certificate is valid.

But who vouches for the CA itself? CA certificates are generally hierarchical: a CA’s certificate is signed by the CA one level above it. At the top is the root CA certificate, which is a self-signed certificate. Operating systems and browsers generally ship with the root certificates of the major CAs built in.
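The sign-and-verify flow can be sketched with textbook RSA on deliberately tiny numbers. The key (N = 61 × 53, E = 17, D = 2753) is a toy chosen for illustration, and the function names are hypothetical. Real signatures use 2048-bit or larger keys and a padding scheme; this sketch omits both.

```python
import hashlib

# Toy RSA key: (N, E) is the public key, D is the private key.
N, E, D = 3233, 17, 2753


def digest_as_int(message: bytes) -> int:
    # Reduced modulo N only because the toy modulus is so small.
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Encrypt the digest with the private key."""
    return pow(digest_as_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Decrypt the signature with the public key and compare digests."""
    return pow(signature, E, N) == digest_as_int(message)


message = b"I am example.com"
signature = sign(message)
print(verify(message, signature))
```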
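The walk up the hierarchy can be sketched like this. The `Cert` type, the `chain_to_root` function, and the certificate names are all hypothetical, and the signature check at each level is deliberately left out, so the sketch only shows how issuer links are followed until a trusted self-signed root is reached.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cert:
    subject: str  # who the certificate is for
    issuer: str   # who signed it


def chain_to_root(leaf, known_certs, trusted_roots, max_depth=10):
    """Return the chain from leaf to a trusted root, or None if there is none."""
    chain = [leaf]
    while chain[-1].subject != chain[-1].issuer:  # stop at a self-signed cert
        parent = known_certs.get(chain[-1].issuer)
        if parent is None or len(chain) >= max_depth:
            return None                           # broken or overlong chain
        chain.append(parent)
    return chain if chain[-1].subject in trusted_roots else None


root = Cert("Example Root CA", "Example Root CA")
intermediate = Cert("Example Intermediate CA", "Example Root CA")
site = Cert("www.example.com", "Example Intermediate CA")
known = {c.subject: c for c in (root, intermediate)}

chain = chain_to_root(site, known, trusted_roots={"Example Root CA"})
print([c.subject for c in chain])
```

If the root at the end of the chain is not in the built-in trust store, the function returns None, which corresponds to the browser rejecting the certificate.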

TLS1.2

Handshake

At present, the mainstream TLS handshake is the ECDHE handshake:

  1. After establishing a TCP connection, the browser sends a Client Hello to the server. It contains the TLS versions supported by the client, the list of cipher suites supported by the client, and a Client Random number used later to generate the session keys, as shown below:

  2. Upon receipt, the server sends four messages together:

  • Server Hello: Confirms the TLS version to use, selects one cipher suite from the client’s list for this connection, and returns a Server Random number used later to generate the session keys:

  • Certificate: The server’s certificate:

  • Server Key Exchange: Contains the parameters the server contributes to key negotiation (Server Params), in this case its elliptic curve public key, together with a signature over the message.

  • Server Hello Done: Indicates that the server has finished sending its hello messages.

  3. After receiving the certificate, the client verifies the server’s certificate and signature. Then, as required by the cipher suite the server selected, the client generates its own key negotiation parameters (Client Params), here its elliptic curve public key:

From the Client Params and Server Params, both sides can compute the Pre-Master Secret with the ECDHE algorithm. The Pre-Master Secret and the exchanged random numbers, Client Random and Server Random, are then fed into the PRF (pseudo-random function) to generate the Master Secret. Finally, the Master Secret, Client Random, and Server Random are fed into the PRF again to generate the session keys, also known as the key block (key_block).

There can be multiple session keys. For example, the client encrypts the messages it sends with client_write_key, and the server encrypts its messages with server_write_key. Because both parties share the same Master Secret, both can derive the same set of keys for symmetric encryption.
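The two PRF steps can be sketched with the standard `hmac` module, following the P_hash construction in RFC 5246 with SHA-256. The random inputs are placeholders rather than values from a real handshake, and the key sizes assume an AES-128-GCM suite (16-byte keys, 4-byte implicit IVs, no separate MAC keys). Suites ending in SHA384 use SHA-384 in the PRF instead.

```python
import hashlib
import hmac
import secrets


def p_sha256(secret: bytes, seed: bytes, length: int) -> bytes:
    """P_hash from RFC 5246 section 5, instantiated with HMAC-SHA256."""
    out = b""
    a = seed  # A(0)
    while len(out) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()  # A(i)
        out += hmac.new(secret, a + seed, hashlib.sha256).digest()
    return out[:length]


def prf(secret: bytes, label: bytes, seed: bytes, length: int) -> bytes:
    return p_sha256(secret, label + seed, length)


client_random = secrets.token_bytes(32)
server_random = secrets.token_bytes(32)
pre_master_secret = secrets.token_bytes(32)  # placeholder for the ECDHE result

master_secret = prf(pre_master_secret, b"master secret",
                    client_random + server_random, 48)

# For key expansion the two randoms are concatenated in the opposite order.
key_block = prf(master_secret, b"key expansion",
                server_random + client_random, 2 * 16 + 2 * 4)

client_write_key, server_write_key = key_block[0:16], key_block[16:32]
client_write_iv, server_write_iv = key_block[32:36], key_block[36:40]
```

Since both ends hold the same three inputs, running the same derivation on each side yields identical keys without any further exchange.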

The client then sends the server a TCP packet containing three messages:

  • Client Key Exchange: Contains the Client Params:

  • Change Cipher Spec: Strictly speaking, this record type is not a handshake message. It is simply a notification telling the other party that all subsequent data will be encrypted. Conversely, everything before it is sent in plain text.

    A Finished message is sent immediately after the Change Cipher Spec message to verify that key exchange and authentication succeeded.

  • Encrypted Handshake Message: This is the Finished message, already encrypted. It carries a digest of all handshake messages from the Client Hello up to, but not including, the Finished message itself.

Note: If the server previously sent a “Certificate Request” message, the client must also send its own certificate and signature. This is called mutual (two-way) authentication. However, there are so many clients that issuing a certificate to each one is impractical. One-way authentication is therefore usually sufficient, and the server can ask the client for a user name and password to authenticate it, which is the familiar pattern.

  4. After receiving these messages, the server sends its own Change Cipher Spec and Finished messages. Both parties have now verified that key exchange and authentication succeeded, and the handshake ends.

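The ECDHE handshake flow is as follows:

  • Client → Server: Client Hello
  • Server → Client: Server Hello, Certificate, Server Key Exchange, Server Hello Done
  • Client → Server: Client Key Exchange, Change Cipher Spec, Finished
  • Server → Client: Change Cipher Spec, Finished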

The RSA handshake is slightly different.

  1. With an RSA cipher suite, the client generates the Pre-Master Secret on its own, without any additional parameters, encrypts it with the server’s public key, and sends it to the server, so the server does not need to send a Server Key Exchange message to complete key negotiation. Since there is no Server Key Exchange message, the server also sends no handshake signature; it proves that it holds the private key by decrypting the Pre-Master Secret and producing a valid Finished message.

The problem with this is that if an attacker ever obtains the server’s private key, they can decrypt the Pre-Master Secret, and therefore the traffic, of every previously captured session. In contrast, the ECDHE handshake does not use the server’s public key to encrypt data directly; instead, the two ends exchange elliptic curve public keys (Client Params and Server Params), which keeps the Pre-Master Secret safe.

  2. With the RSA handshake, the client can send HTTP data only after receiving the Finished message from the server. With the ECDHE handshake, the client can send HTTP data before the server has responded, saving one RTT. You may ask why this can’t be done with RSA. I don’t know; look for answers in the comments section.

The flowchart is as follows:

TLS1.3

TLS1.3 tightens TLS security. It explicitly prohibits compression in the record protocol, replaces the pseudo-random function PRF with HKDF, and abolishes many cryptographic algorithms, leaving only five cipher suites:

  • TLS_AES_128_GCM_SHA256
  • TLS_AES_256_GCM_SHA384
  • TLS_CHACHA20_POLY1305_SHA256
  • TLS_AES_128_CCM_SHA256
  • TLS_AES_128_CCM_8_SHA256

For backward compatibility, TLS1.3 does not change the existing record format but carries its new features in extensions appended to the end of the Hello messages. Older versions of TLS that do not recognize an extension simply ignore it. Note that the version field of the record protocol still says TLS1.2; the supported_versions extension must be added to the Hello message to indicate that the newer protocol version is in use.
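For comparison with the TLS1.2 PRF, here is a minimal sketch of HKDF's two steps (extract, then expand) as defined in RFC 5869, using HMAC-SHA256 from the standard library. TLS1.3 wraps these steps in its own labeled helper functions, which are omitted here; the `info` labels and the input secret below are made-up placeholders.

```python
import hashlib
import hmac

HASH = hashlib.sha256
HASH_LEN = HASH().digest_size  # 32 bytes for SHA-256


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """Step 1: condense the input keying material into a pseudo-random key."""
    return hmac.new(salt or bytes(HASH_LEN), ikm, HASH).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """Step 2: stretch the pseudo-random key to `length` bytes of output."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested output is too long")
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), HASH).digest()
        okm += block
        counter += 1
    return okm[:length]


shared_secret = b"\x01" * 32  # placeholder for a key exchange result
prk = hkdf_extract(b"", shared_secret)
key = hkdf_expand(prk, b"example key", 16)  # hypothetical label
iv = hkdf_expand(prk, b"example iv", 12)    # hypothetical label
```

Different `info` labels yield independent outputs from the same secret, which is how one shared secret can feed several keys.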

Handshake

Extensions allow HTTPS to be optimized in various ways. In the TLS1.3 handshake, the client uses the supported_groups extension in the Client Hello to list the elliptic curves it supports, key_share to carry the client public key parameters for those curves, and signature_algorithms to list the signature algorithms it supports.

After receiving these extensions, the server selects a curve and its parameters and returns the server’s public key parameters in its own “key_share” extension, which completes the key exchange between the two parties. The rest is the same as in TLS1.2.

The flowchart is as follows:

Note that after the master key is calculated, the server immediately sends “Change Cipher Spec” and switches to encrypted communication. The certificate and other subsequent messages are therefore encrypted before being sent to the client, which further enhances security.

HTTPS optimization

First of all, use TLS1.3 wherever possible: it shortens the handshake and is more secure. If only TLS1.2 is available, use a cipher suite with ECDHE (elliptic curve) key exchange.
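On the client side, that advice might look like the following sketch with Python's standard `ssl` module. It assumes Python 3.7 or later built against OpenSSL; the cipher string "ECDHE+AESGCM" is one possible choice, not the only reasonable one.

```python
import ssl

context = ssl.create_default_context()

# Refuse anything older than TLS1.2; TLS1.3 is negotiated when both sides support it.
context.minimum_version = ssl.TLSVersion.TLSv1_2

# For TLS1.2, allow only ECDHE key exchange with AES-GCM.
# TLS1.3 suites are configured separately by OpenSSL and are not affected.
context.set_ciphers("ECDHE+AESGCM")

for cipher in context.get_ciphers():
    print(cipher["name"])
```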

Session reuse can also be used, which is similar to HTTP caching, and is very effective for improving performance.

There are two kinds of session reuse. The first is the Session ID. After the client and server connect for the first time, each stores a session ID, along with the master key and other related information, in memory. When the client reconnects and sends the ID, the server looks it up in memory. If the ID is found, the server restores the session state directly from the stored master key, skipping certificate authentication and key exchange, so secure communication is established with only one RTT.

Session ID is the most widely used scheme, but it requires the server to keep session data for every client, which inevitably burdens the server when there are many clients. The second scheme, the Session Ticket, is similar to HTTP cookies in that it moves the burden of storing session data from the server to the client. The server encrypts the session information and sends it to the client in a New Session Ticket message. The client saves the ticket and, when reconnecting, sends it to the server in the session_ticket extension. After the server decrypts and verifies the ticket, it can resume the session and start encrypted communication.
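The server-side bookkeeping for the Session ID scheme can be sketched as a simple in-memory cache. The names `session_cache`, `full_handshake`, and `resume` are hypothetical, and the master secret is a random placeholder rather than the result of a real key exchange.

```python
import secrets
from typing import Optional

# Server memory: Session ID -> master secret.
session_cache: dict[bytes, bytes] = {}


def full_handshake() -> tuple[bytes, bytes]:
    """First connection: run the full handshake, then remember the result."""
    session_id = secrets.token_bytes(32)
    master_secret = secrets.token_bytes(48)  # placeholder for real key exchange
    session_cache[session_id] = master_secret
    return session_id, master_secret


def resume(session_id: bytes) -> Optional[bytes]:
    """Later connection: look the ID up; None means a full handshake is needed."""
    return session_cache.get(session_id)


session_id, master_secret = full_handshake()
print(resume(session_id) == master_secret)
print(resume(b"unknown id"))
```

The cache grows with every client, which is exactly the memory cost that makes this scheme heavy on busy servers.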


Flow chart source:

Looking through the HTTP protocol

References:

Looking through the HTTP protocol

Interesting talk about network protocol

HTTPS reviews the past to know the new

Overview of SSL/TLS and certificates

Wikipedia – Block cipher working mode

RFC 2818

RFC 5246

RFC 8422