Let’s learn about HTTPS. The first question is: why do we need HTTPS when we already have HTTP? It occurred to me that in interviews we tend to recite the standard answer to this question. Why not give our own understanding instead of memorizing so-called standard answers, without ever checking whether they are actually right?

Why did HTTPS appear

  • Every new technology is invented to solve some problem, so what problem with HTTP does HTTPS solve?

What problem does HTTPS solve

  • The short answer is that HTTP is not secure. HTTP transmits everything in plaintext, so anyone on the path can intercept, modify, or forge a request.
  • HTTP does not verify the identity of the communicating parties, so either side of the exchange may be an impostor. In other words, there is no authentication.
  • HTTP does not verify message integrity, so neither the sender nor the receiver can tell whether a message was altered in transit.

HTTPS was created to solve these problems.

What is HTTPS

Do you remember how HTTP is defined? The Hypertext Transfer Protocol (HTTP) is a protocol for transferring text, images, audio, video, and other hypertext data between two points in the computer world.

  • HTTPS stands for Hypertext Transfer Protocol Secure. It secures communication between two end systems on a computer network: it is a protocol and specification for the secure transmission of text, images, audio, video, and other hypertext data between two points.
  • HTTPS is an extension of the HTTP protocol. HTTP by itself does not guarantee transmission security, so who does? In HTTPS, the communication is encrypted using Transport Layer Security (TLS) or its predecessor, Secure Sockets Layer (SSL). That is, HTTP + SSL (TLS) = HTTPS.

What does HTTPS do

The HTTPS protocol provides three key guarantees

  • Encryption: HTTPS encrypts data to protect it from eavesdroppers. While a user browses a website, nobody can listen in on the information exchanged between the user and the site in order to steal it.
  • Data integrity: data cannot be modified in transit without detection. What the user sends arrives at the server intact, so the server receives exactly what the user sent.
  • Authentication: confirming the true identity of the other party, that is, proving that you are you (much like face recognition). This prevents man-in-the-middle attacks and builds user trust.

With these three guarantees in place, users can safely exchange information with the server. So, given all the benefits of HTTPS, how do you know whether a site uses HTTPS or HTTP? Look at the address bar: the URL begins with https:// rather than http://.

The HTTPS specification itself is very simple. The RFC (RFC 2818) is only 7 pages long. It defines the new scheme name and the default port number, 443. Everything else, including the request-response model, message structure, request methods, URIs, header fields, and connection management, is inherited from HTTP with nothing new.

In other words, apart from the scheme name and default port number (HTTP defaults to port 80), HTTPS has the same syntax and semantics as HTTP. So how does HTTPS achieve the security that HTTP cannot? The key is the S, which stands for SSL/TLS.
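
The point that only the scheme name and default port differ can be sketched with the Python standard library. The helper name `effective_port` is hypothetical and exists only for this illustration; `HTTP_PORT` and `HTTPS_PORT` are constants from `http.client`.

```python
from http.client import HTTP_PORT, HTTPS_PORT  # 80 and 443
from urllib.parse import urlsplit

# Default port for each scheme, as described in the text above.
DEFAULT_PORTS = {"http": HTTP_PORT, "https": HTTPS_PORT}


def effective_port(url: str) -> int:
    """Return the explicit port in the URL, or the scheme's default port."""
    parts = urlsplit(url)
    return parts.port or DEFAULT_PORTS[parts.scheme]


print(effective_port("http://example.com/index.html"))   # 80
print(effective_port("https://example.com/index.html"))  # 443
```

Apart from this lookup, the URL is parsed in exactly the same way for both schemes, which matches the claim that the syntax is shared.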

What is SSL/TLS

Getting to know SSL/TLS

Transport Layer Security (TLS) is the successor to Secure Sockets Layer (SSL). TLS is a protocol used for authentication and encryption between two computers on the Internet.

Note: on the Internet, the names SSL and TLS are often used interchangeably.

  • One of the most important requirements for online businesses (such as online payments) is a trusted environment in which customers can make transactions with peace of mind. SSL/TLS provides this. It works by binding identity information about websites and companies to cryptographic keys in digital documents called X.509 certificates.
  • Each key pair has a private key and a public key. The private key is unique and stays on the server, where it decrypts information encrypted with the public key. The public key is public: everyone who interacts with the server can hold it. Information encrypted with the public key can be decrypted only with the private key.

What is X.509? X.509 is the standard format for public key certificates, documents that securely associate a cryptographic key with an identity (a person or an organization).

X.509 is mainly used for the following

  • Authenticated and encrypted web browsing over SSL/TLS and HTTPS
  • E-mail messages signed and encrypted with the S/MIME protocol
  • Code signing: the process of signing software applications with digital certificates for secure distribution and installation. By digitally signing software with a certificate issued by a well-known public certificate authority (such as SSL.com), developers can assure end users that the software they want to install was released by a known and trusted developer and has not been tampered with or damaged since it was signed.
  • Document signing
  • Client authentication
  • Government-issued electronic ID cards (see www.ssl.com/article/pki…)

We’ll talk about that later.

The core of HTTPS is HTTP

HTTPS is not a new application-layer protocol. It is HTTP with its communication interface replaced by SSL/TLS. Normally HTTP talks directly to TCP. With HTTPS, HTTP first talks to SSL, and SSL then talks to TCP. In other words, HTTPS is HTTP running on top of an SSL layer.

SSL is an independent protocol. It can carry not only HTTP but also other application-layer protocols, such as SMTP (e-mail) and Telnet (remote login).
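
The layering described above (HTTP on top of TLS on top of TCP) can be sketched with Python's standard `socket` and `ssl` modules. This is a minimal illustration and not a real HTTP client. The function name `https_head` is hypothetical, and calling it needs network access to the host you pass in.

```python
import socket
import ssl


def https_head(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Send an ordinary HTTP HEAD request through a TLS-wrapped TCP socket.

    Returns the first line of the response (the HTTP status line).
    """
    # The default context verifies the certificate chain and the host name.
    context = ssl.create_default_context()
    # Bottom layer: a plain TCP connection.
    with socket.create_connection((host, port), timeout=timeout) as tcp_sock:
        # Middle layer: TLS wraps the TCP socket and performs the handshake.
        with context.wrap_socket(tcp_sock, server_hostname=host) as tls_sock:
            # Top layer: the same HTTP text that plain HTTP would send.
            request = f"HEAD / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
            tls_sock.sendall(request.encode("ascii"))
            response = tls_sock.recv(4096)
    return response.split(b"\r\n", 1)[0].decode("ascii", "replace")
```

The request is the same text that would go over plain HTTP on port 80. The only difference is that it is written to the TLS socket instead of directly to the TCP socket.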

Exploring HTTPS

So what makes HTTPS worthy of its name? As we said, the answer is TLS/SSL.

SSL, or Secure Sockets Layer, sits at layer 5 (the session layer) of the seven-layer OSI model. In 1999 the Internet Engineering Task Force (IETF) renamed SSL to TLS, Transport Layer Security. Currently the most widely used version is 1.2, so the following discussion is based on TLS 1.2.

TLS provides confidentiality and data integrity between two communicating applications. It is made up of several sub-protocols, including the record protocol, the handshake protocol, the alert protocol, the change cipher spec protocol, and extensions, and it combines symmetric encryption, asymmetric encryption, identity authentication, and many other cryptographic techniques. (If a technology looks simple to you, you probably have not studied it deeply enough. Every technology has its own beauty, and the people behind it deserve appreciation rather than disparagement.)

We have not yet looked at how TLS cipher suites are named. Here is an example that shows the structure (see www.iana.org/assignments…):

```
ECDHE-ECDSA-AES256-GCM-SHA384
```

What does that mean? TLS cipher suite names follow a fairly regular convention. The basic format is a string made of the key exchange algorithm, the signature algorithm, the symmetric encryption algorithm, and the digest algorithm, joined by hyphens. The symmetric algorithm is sometimes followed by a block cipher mode. Let’s read the example above.

ECDHE is used for key exchange, ECDSA for signature and authentication, and AES with a 256-bit key for symmetric encryption, with GCM as the block cipher mode. Finally, SHA384 is the digest algorithm.
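
The breakdown above can be written as a small parser. This is only a sketch for the five-part hyphenated form used in the example. The function name `parse_cipher_suite` is hypothetical, and real suite names come in other shapes that this helper deliberately rejects.

```python
def parse_cipher_suite(name: str) -> dict:
    """Split a five-part suite name such as ECDHE-ECDSA-AES256-GCM-SHA384."""
    parts = name.split("-")
    if len(parts) != 5:
        raise ValueError(f"expected 5 hyphen-separated parts, got {len(parts)}")
    key_exchange, signature, cipher, mode, digest = parts
    return {
        "key_exchange": key_exchange,  # how the session key is agreed
        "signature": signature,        # how the server proves its identity
        "cipher": cipher,              # symmetric algorithm and key length
        "mode": mode,                  # block cipher mode
        "digest": digest,              # hash used for integrity
    }


print(parse_cipher_suite("ECDHE-ECDSA-AES256-GCM-SHA384"))
# {'key_exchange': 'ECDHE', 'signature': 'ECDSA', 'cipher': 'AES256', 'mode': 'GCM', 'digest': 'SHA384'}
```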

At its core, TLS uses two kinds of encryption: symmetric and asymmetric.

Symmetric encryption

Before we look at symmetric encryption, let’s cover some cryptography basics. There are a few core concepts: plaintext, ciphertext, encryption, and decryption.

  • Plaintext: generally a meaningful sequence of characters or bits, or a message that can be recovered through some public encoding. Plaintext is usually denoted by m or p.
  • Ciphertext: the result of encrypting plaintext.
  • Encryption: the process of converting plaintext into ciphertext.
  • Decryption: the process of restoring ciphertext to plaintext.

Symmetric encryption, as its name suggests, uses the same key for both encryption and decryption. As long as the key stays secret, the whole communication is confidential.
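
The idea that one key both encrypts and decrypts can be sketched with a toy cipher. The Python standard library does not include AES, so this example is an assumption made purely for illustration: it builds a keystream from SHA-256 and XORs it with the data. It is not secure and is not what TLS uses. The names `keystream` and `xor_cipher` are hypothetical.

```python
import hashlib
import secrets


def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key (toy construction)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR with the same keystream undoes itself."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


key = secrets.token_bytes(32)                        # the shared secret key
ciphertext = xor_cipher(key, b"hello over the wire")  # sender encrypts
plaintext = xor_cipher(key, ciphertext)               # receiver decrypts with the SAME key
print(plaintext)  # b'hello over the wire'
```

Anyone who obtains `key` can run the same function and read the message, which is exactly why the key must stay secret.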

  • Many symmetric encryption algorithms exist, such as DES, 3DES (also called TDEA), AES, ChaCha20, Blowfish, RC2, RC4, RC5, IDEA, and SKIPJACK. The most commonly used ones today are AES (with 128-, 192-, or 256-bit keys) and ChaCha20.

  • DES stands for Data Encryption Standard. It is a symmetric-key algorithm for encrypting digital data. Its 56-bit key is too short to be secure for modern applications, but it was very influential in the development of cryptography.

  • 3DES is derived from the original Data Encryption Standard (DES): it applies DES three times to each block. It became important from the late 1990s, but has since been superseded by more advanced algorithms.

  • AES-128, AES-192, and AES-256 are all variants of AES, the Advanced Encryption Standard, and differ only in key length. AES was designed as the replacement for DES. It offers high security and high performance and is the most widely used symmetric encryption algorithm.

  • ChaCha20 is another encryption algorithm, designed by Daniel J. Bernstein and promoted by Google. Its key length is fixed at 256 bits. In pure software it is faster than AES. On CPUs with hardware AES acceleration that advantage shrinks, but it is still a very good algorithm.

(You can look up the others yourself.)

Block cipher modes

  • Symmetric block ciphers also have the concept of a mode of operation (block mode), which lets an algorithm with a fixed block size and a fixed-length key encrypt plaintext of any length. In TLS, the GCM mode is used only together with AES, CAMELLIA, and ARIA, and AES is by far the most popular and widely deployed choice.

  • The earliest modes were ECB, CBC, CFB, and OFB, but weaknesses have been found in them or in the way they were used, so they are rarely used now. The newer approach is Authenticated Encryption with Associated Data (AEAD), which adds authentication to encryption. Commonly used AEAD constructions are GCM, CCM, and ChaCha20-Poly1305.

  • For example, in ECDHE_ECDSA_AES128_GCM_SHA256, AES128 indicates a 128-bit key, while AES256 would indicate a 256-bit key. GCM (Galois/Counter Mode) is a modern AEAD mode of operation for block ciphers with a 128-bit block size.

  • That covers symmetric encryption, where the encrypting and decrypting parties use the same key. The sender encrypts the data, but the receiver cannot decrypt it until the sender also hands over the key. It is like Little Soldier Zhang Ga delivering an encrypted letter while also carrying the key to decrypt it: if the enemy catches him on the way, the contents of the letter are completely exposed. So symmetric encryption on its own is risky.

Asymmetric encryption

Asymmetric encryption, also known as public-key encryption, improves on symmetric encryption by solving the key distribution problem: even if the key sent over the network is intercepted, the data is not exposed. Asymmetric encryption uses two keys, a public key and a private key. The public key encrypts and the private key decrypts. Anyone may use the public key, but the private key must be known only to its owner.

  • Text encrypted with the public key can be decrypted only with the private key, and text encrypted with the private key can be decrypted with the public key. The public key does not need to be kept secret, because it has to travel across the network anyway. This is how asymmetric encryption solves the key exchange problem. The website keeps the private key and distributes the public key freely on the Internet. To log in to the website, you encrypt your data with the public key, and only the holder of the private key can decrypt the ciphertext. Attackers cannot read the ciphertext because they do not have the private key.

  • Asymmetric algorithms are much harder to design than symmetric ones (we will not go into the details here). Common examples are DH, DSA, RSA, and ECC.

  • RSA is the most important and best known of these. It appears, for example, in the cipher suite DHE_RSA_CAMELLIA128_GCM_SHA256. Its security rests on the difficulty of integer factorization: the product of two very large primes is used as the material for generating the keys, and deriving the private key from the public key is extremely difficult.

  • ECC (Elliptic Curve Cryptography) is another family of asymmetric algorithms. It is based on the elliptic curve discrete logarithm problem and generates public and private keys from a specific curve equation and base point. ECDHE is used for key exchange, and ECDSA is used for digital signatures.

  • TLS uses a mixture of symmetric and asymmetric encryption to achieve confidentiality.
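
The RSA idea in the list above can be sketched with textbook RSA on deliberately tiny primes. This is an illustration only: real keys use primes that are hundreds of digits long, plus padding that is omitted here. The primes 61 and 53, the exponent 17, and the function names are all assumptions chosen for the example.

```python
# Key generation with toy primes (far too small to be secure).
p, q = 61, 53
n = p * q                  # 3233, the public modulus
phi = (p - 1) * (q - 1)    # 3120
e = 17                     # public exponent, coprime with phi
d = pow(e, -1, phi)        # 2753, private exponent (needs Python 3.8+)

public_key = (e, n)        # can be handed to anyone
private_key = (d, n)       # stays with the owner


def encrypt(message: int, key: tuple) -> int:
    exponent, modulus = key
    return pow(message, exponent, modulus)


def decrypt(ciphertext: int, key: tuple) -> int:
    exponent, modulus = key
    return pow(ciphertext, exponent, modulus)


ciphertext = encrypt(65, public_key)              # anyone can encrypt
assert decrypt(ciphertext, private_key) == 65     # only the private key decrypts
```

Recovering `d` from `(e, n)` requires factoring `n` into `p` and `q`. That is trivial for 3233 but is believed to be infeasible for the key sizes used in practice.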

Hybrid encryption

RSA is very slow and AES is very fast, so TLS combines the two in a hybrid scheme. At the start of communication, an asymmetric algorithm such as RSA or ECDHE solves the key exchange problem. With RSA, for example, one side generates a random session key for the symmetric algorithm and encrypts it with the peer’s public key. The peer decrypts the ciphertext with its private key and extracts the session key. In this way the symmetric key is exchanged securely, and the rest of the session uses fast symmetric encryption.
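
The RSA-based flow just described can be sketched end to end. Everything here is a toy built on assumptions: the "server" key pair is textbook RSA from two known Mersenne primes, with no padding, and the symmetric part is a SHA-256 keystream XOR standing in for AES. None of it is secure. The sketch shows only the order of operations.

```python
import hashlib
import math
import secrets

# Toy server key pair (textbook RSA; both primes are known Mersenne primes).
P, Q = 2**127 - 1, 2**89 - 1
N = P * Q
PHI = (P - 1) * (Q - 1)
E = 65537
assert math.gcd(E, PHI) == 1
D = pow(E, -1, PHI)  # private exponent, known only to the server


def stream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR data with a SHA-256 based keystream."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


# Client: pick a random session key and wrap it with the server's PUBLIC key.
session_key = secrets.token_bytes(16)
wrapped_key = pow(int.from_bytes(session_key, "big"), E, N)
ciphertext = stream_xor(session_key, b"GET /account HTTP/1.1")

# Server: unwrap the session key with the PRIVATE key, then decrypt the data.
recovered_key = pow(wrapped_key, D, N).to_bytes(16, "big")
plaintext = stream_xor(recovered_key, ciphertext)
print(plaintext)  # b'GET /account HTTP/1.1'
```

The slow asymmetric operation runs once, on 16 bytes. All of the application data goes through the fast symmetric cipher.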

Now that hybrid encryption gives us confidentiality, can we transmit data safely? Not yet. On top of confidentiality we also need integrity and identity authentication to achieve real security. The main tool for integrity is the digest algorithm.

Digest algorithm

  • How do you achieve integrity? In TLS, the main means is the digest algorithm, that is, a cryptographic hash function. A well-known example is MD5 (Message Digest Algorithm 5), which produces a 128-bit value from input of arbitrary length.
  • Despite being insecure, MD5 is still in use today, most commonly to verify file integrity. It has also been used in security protocols and applications such as SSH, SSL, and IPSec. Some applications strengthen MD5 by salting the plaintext or by applying the hash function more than once.

What is a salt? In cryptography, a salt is random data used as an additional input to a one-way function that hashes data or a password. Salts are used to protect stored passwords.
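
A minimal sketch of salted password storage, using `hashlib.pbkdf2_hmac` from the Python standard library. The function names, the 16-byte salt, and the iteration count are assumptions chosen for illustration and are not recommendations.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple


def hash_password(password: str, salt: Optional[bytes] = None,
                  iterations: int = 100_000) -> Tuple[bytes, bytes]:
    """Return (salt, digest). A fresh random salt is generated if none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes,
                    iterations: int = 100_000) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected)


salt_a, digest_a = hash_password("correct horse")
salt_b, digest_b = hash_password("correct horse")
print(digest_a == digest_b)                               # False: same password, different salts
print(verify_password("correct horse", salt_a, digest_a))  # True
```

Because every stored password gets its own salt, two users with the same password end up with different digests, and a precomputed table of hashes is of no use to an attacker.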

What does one-way mean? The algorithm has no key and no decryption step. It works in one direction only: the output cannot be decrypted, and the original text cannot be recovered from it.

Going back to the digest algorithm, you can think of it as a special kind of compression that reduces data of arbitrary length to a fixed-length string. It is like putting a lock on the data.
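
The fixed-length property is easy to observe with Python's `hashlib`. The helper `digest_hex` and the sample messages are made up for this sketch. SHA-256, one of the SHA-2 functions discussed below, is included for comparison with MD5.

```python
import hashlib


def digest_hex(algorithm: str, data: bytes) -> str:
    """Return the hex digest of `data` under the named hash algorithm."""
    return hashlib.new(algorithm, data).hexdigest()


short_md5 = digest_hex("md5", b"hi")
long_md5 = digest_hex("md5", b"hi" * 100_000)
# 32 hex characters = 128 bits, regardless of the input length.
print(len(short_md5), len(long_md5))  # 32 32

original = digest_hex("sha256", b"transfer 100 to Alice")
tampered = digest_hex("sha256", b"transfer 900 to Alice")
# Changing a single character yields an unrelated digest.
print(original == tampered)  # False
```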

  • Besides MD5, Secure Hash Algorithm 1 (SHA-1) is another common digest algorithm. However, SHA-1 is also insecure and has been deprecated in TLS. TLS currently recommends SHA-2, the successor to SHA-1.

  • The SHA-2 family contains six hash functions with digests of 224, 256, 384, or 512 bits: SHA-224, SHA-256, SHA-384, SHA-512, SHA-512/224, and SHA-512/256. The first four produce digests of 28, 32, 48, and 64 bytes respectively.

  • SHA-2 protects data integrity. If you change a punctuation mark or add a space in a file, the resulting digest is completely different. However, a bare digest involves no key: an attacker who can modify the plaintext can simply recompute the digest, so a digest alone is still not secure enough.

  • A more secure approach is HMAC. To understand HMAC, you first need to know what a MAC is.

  • A MAC, or Message Authentication Code, is generated from the message and a secret key by a MAC algorithm. The MAC value allows a verifier (who also holds the secret key) to detect any change to the message content, which protects the integrity of the message.

  • HMAC is a specific kind of MAC that is built from a cryptographic hash function and a secret key. Any cryptographic hash function, such as SHA-256, can be used in the HMAC calculation.
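
A short sketch of HMAC with SHA-256, using Python's standard `hmac` module. The helper names `make_tag` and `verify_tag` and the sample messages are assumptions for illustration.

```python
import hashlib
import hmac
import secrets


def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)


shared_key = secrets.token_bytes(32)   # known to sender and receiver only
message = b"transfer 100 to Alice"
tag = make_tag(shared_key, message)

print(verify_tag(shared_key, message, tag))                    # True
print(verify_tag(shared_key, b"transfer 900 to Alice", tag))   # False: message was altered
```

Unlike a bare digest, an attacker who alters the message cannot produce a matching tag without the shared key.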

Now that the integrity problem is solved, one problem remains: authentication. While data is being sent to the server, an attacker could pose as either party to steal information. The attacker could masquerade as you and send messages to the server, or masquerade as the server and receive messages from you. So how do we solve this?

Authentication

How do you prove that you are really you? Earlier we described encrypting with the public key and decrypting with the private key. Since only you hold the private key, it can identify you uniquely, so we can reverse the order: encrypt with the private key and decrypt with the public key. Combining the private key with a digest algorithm gives us digital signatures, and digital signatures give us authentication.
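
The "private key signs, public key verifies" idea can be sketched with textbook RSA over a SHA-256 digest. This is a toy under stated assumptions: the key pair is built from two known Mersenne primes, there is no padding scheme, and the names `sign` and `verify` are hypothetical. Real signatures use much larger keys and standardized padding.

```python
import hashlib

# Toy key pair (textbook RSA; both primes are known Mersenne primes).
P, Q = 2**127 - 1, 2**89 - 1
N = P * Q
E = 65537                             # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))     # private exponent, kept secret


def digest_as_int(message: bytes) -> int:
    """Hash the message and reduce it into the range of the modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Signer: transform the digest with the PRIVATE exponent."""
    return pow(digest_as_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Anyone: undo the transform with the PUBLIC exponent and compare digests."""
    return pow(signature, E, N) == digest_as_int(message)


signature = sign(b"I am example.com")
print(verify(b"I am example.com", signature))   # True
print(verify(b"I am evil.example", signature))  # False
```

Only the digest is signed, not the whole message, which keeps the slow private-key operation small no matter how long the message is.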

So far, by combining symmetric encryption, asymmetric encryption, and digest algorithms, we have achieved confidentiality, data integrity, and authentication. So is it safe now? Not quite. Digital signatures still have a trust problem: the private key is held by its owner, but anyone can publish a public key and claim it belongs to someone else. To solve this, public keys must be published in a certified form.

This is where the CA, or Certificate Authority, comes in. Having a CA issue a certified public key solves the public key trust problem.

There are only a limited number of widely trusted CAs in the world. They issue three kinds of certificates, DV, OV, and EV, which differ in the degree of trust. DV is the lowest and only validates control of the domain name. EV is the highest: after strict legal and audit verification it proves the identity of the website owner (browsers have historically displayed the company name in the address bar for such sites, for example Apple and GitHub). CAs at different trust levels are linked together to form a hierarchy.

  • Typically, an applicant for a digital certificate generates a key pair, consisting of a private key and a public key, together with a certificate signing request (CSR). The CSR is an encoded text file that contains the public key and other information to be included in the certificate (such as the domain name, organization, and e-mail address).
  • The key pair and CSR are usually generated on the server where the certificate will be installed, and the information contained in the CSR depends on the validation level of the certificate. Unlike the public key, the applicant’s private key must be kept secret and should never be shown to the CA (or anyone else).

After generating the CSR, the applicant sends it to the CA. The CA verifies that the information it contains is correct and, if so, digitally signs the certificate with the CA’s own private key and sends it back to the applicant.
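
The issuance flow above can be modeled in a few lines. This is a conceptual toy, not the real X.509 or CSR format: the "CSR" is a plain dictionary serialized as JSON, the keys are textbook RSA built from known Mersenne primes, and every function name is hypothetical.

```python
import hashlib
import json


def make_toy_keypair(p: int, q: int, e: int = 65537):
    """Return ((e, n), (d, n)): a textbook RSA public and private key."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))
    return (e, n), (d, n)


def _digest(payload: bytes, n: int) -> int:
    return int.from_bytes(hashlib.sha256(payload).digest(), "big") % n


def issue_certificate(csr: dict, ca_private_key) -> dict:
    """CA side: sign the applicant's identity and public key with the CA private key."""
    d, n = ca_private_key
    payload = json.dumps(csr, sort_keys=True).encode("utf-8")
    return {"subject": csr, "signature": pow(_digest(payload, n), d, n)}


def verify_certificate(cert: dict, ca_public_key) -> bool:
    """Client side: check the CA signature using only the CA public key."""
    e, n = ca_public_key
    payload = json.dumps(cert["subject"], sort_keys=True).encode("utf-8")
    return pow(cert["signature"], e, n) == _digest(payload, n)


ca_public, ca_private = make_toy_keypair(2**127 - 1, 2**89 - 1)
site_public, site_private = make_toy_keypair(2**107 - 1, 2**61 - 1)

# The request carries the site's PUBLIC key only. The private key never leaves the site.
csr = {"domain": "example.com", "public_key": list(site_public)}
certificate = issue_certificate(csr, ca_private)
print(verify_certificate(certificate, ca_public))  # True
```

A client that already trusts the CA public key can now trust the site's public key inside the certificate. Changing any field of the subject invalidates the signature.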

Conclusion

This article covered why HTTPS was invented, what problems with HTTP it solves, how HTTPS relates to HTTP, what TLS and SSL are and what problems they solve, and how truly secure data transfer is achieved.

Writing this article was not easy. If you liked it or found it helpful, please like, share, and follow. More articles are on the way, and they will be all substance!