HTTPS is a secure communication protocol based on cryptography. Strictly speaking, it is HTTP layered on top of SSL/TLS. Before you can understand HTTPS, you need to understand some basic cryptographic concepts: plaintext, ciphertext, cipher, key, symmetric encryption, asymmetric (public-key) encryption, message digest, message authentication code, digital signature, and digital certificate. I will explain these terms one by one. In this article, “data” and “message” mean the same thing: the content that users exchange when they communicate. The following roles appear throughout the article:

  • Alice: Message sender
  • Bob: Message receiver
  • Attacker: A man in the middle who can intercept or alter messages
  • Trent: A trusted third-party certification authority

Cipher

A “cipher” in cryptography is not the same thing as the password used to log in to a website. A login password is a text string used for authentication.

A cipher is a set of algorithms used to encrypt and decrypt messages. The process of turning plaintext into ciphertext is called encryption, and the process of turning ciphertext back into plaintext is called decryption. The encryption algorithm and the decryption algorithm together are called a cipher algorithm.

Key

A key is a parameter supplied to a cipher algorithm. Under the same cipher algorithm, the same plaintext produces different ciphertext when different keys are used. Many well-known cryptographic algorithms are public, so the security of the ciphertext depends on keeping the key secret. Generally, for a given algorithm, the longer the key, the harder it is to crack. According to how the key is used, ciphers can be classified into symmetric encryption and public-key encryption.

Symmetric encryption

Symmetric encryption is also called shared-key encryption: the same key is used for both encryption and decryption. Common symmetric encryption algorithms include DES, 3DES, AES, RC5, and RC6. Symmetric encryption has the advantage of being fast, but it has one disadvantage: the sender must somehow tell the receiver which key to decrypt with, so safely delivering the key to the receiver becomes a problem.

When Alice sends data to Bob, she first encrypts it with a symmetric cipher. Because the data is encrypted in transit, someone who steals it cannot read it without knowing the key. However, Bob has the same problem: he cannot read the data either, because he does not know the key. So can Alice send both the data and the key to Bob? Of course not. Sending the key along with the data is no different from sending the plaintext, because anyone who gets the key and the data at the same time can decrypt the ciphertext. So key distribution is a problem for symmetric encryption. Public-key encryption is one solution.
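To make the idea concrete, here is a minimal sketch in Python. It is a toy: the standard library has no AES, so I assume a simple XOR stream cipher whose keystream is derived with SHA-256. The names `keystream`, `xor_cipher`, `shared_key`, and `other_key` are made up for illustration, and this construction must not be used to protect real data.

```python
import hashlib
import secrets

def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes by hashing key || counter (toy construction)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR with the same keystream is its own inverse."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

shared_key = secrets.token_bytes(16)  # Alice and Bob must both know this key
other_key = secrets.token_bytes(16)   # any other key, e.g. the Attacker's guess

plaintext = b"Hello Bob, this is Alice"
ciphertext = xor_cipher(shared_key, plaintext)   # Alice encrypts
recovered = xor_cipher(shared_key, ciphertext)   # Bob decrypts with the same key
wrong = xor_cipher(other_key, ciphertext)        # a different key yields garbage
```

The same function serves for encryption and decryption, which is exactly what “symmetric” means. The sketch also shows the weak point: `shared_key` has to reach Bob somehow before he can call `xor_cipher`.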

Public key encryption

Public-key cryptography is a family of cryptographic algorithms that use a key pair consisting of an encryption key and a decryption key. The sender encrypts with the encryption key and the receiver decrypts with the decryption key. The encryption key is made public and can be obtained by anyone, so it is also called the public key. The decryption key must not be made public and is used only by its owner, so it is also called the private key. The most common public-key encryption algorithm is RSA.

Again, take Alice sending data to Bob as an example. With public-key encryption, the process is initiated by the receiver, Bob:

  1. Bob generates a public and private key pair. The private key is kept for himself and cannot be disclosed to anyone.
  2. Bob sends the public key to Alice. It doesn’t matter if someone steals it, because the public key is not a secret.
  3. Alice encrypts the data with the public key and sends it to Bob. It doesn’t matter if someone steals the data in the process, because it can’t be decrypted without the paired private key.
  4. Bob decrypts it with the paired private key.
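
The four steps above can be sketched with textbook RSA. This is a toy under loud assumptions: the primes 61 and 53 are tiny values chosen for readability, there is no padding, and the helper names `encrypt` and `decrypt` are made up for illustration. Real systems rely on a vetted cryptographic library and much larger keys.

```python
# Toy textbook RSA with tiny primes: for illustration only, NOT secure.

# Step 1: Bob generates a key pair.
p, q = 61, 53                  # Bob's secret primes (assumed toy values)
n = p * q                      # modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                         # public exponent, coprime with phi
d = pow(e, -1, phi)            # private exponent (modular inverse, Python 3.8+)

bob_public_key = (e, n)        # Step 2: Bob sends this to Alice; it is not secret
bob_private_key = (d, n)       # Bob never discloses this

def encrypt(public_key, m: int) -> int:
    """Encrypt an integer message m (0 <= m < n) with the public key."""
    exponent, modulus = public_key
    return pow(m, exponent, modulus)

def decrypt(private_key, c: int) -> int:
    """Decrypt an integer ciphertext c with the private key."""
    exponent, modulus = private_key
    return pow(c, exponent, modulus)

message = 65                                        # the data, encoded as a number below n
ciphertext = encrypt(bob_public_key, message)       # Step 3: Alice encrypts
recovered = decrypt(bob_private_key, ciphertext)    # Step 4: Bob decrypts
```

Note that nothing Alice needs is secret: she only ever handles `bob_public_key`, while `d` never leaves Bob.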

Although public-key encryption solves the key distribution problem, Alice cannot confirm that the public key is legitimate. She cannot be sure that it really came from Bob, because a man in the middle may replace the real public key with his own while Bob is sending it to Alice, and Alice would know nothing about it. Another drawback is that public-key encryption runs much slower than symmetric encryption.

Message digest

A message digest function is an algorithm used to check data integrity. It is also called a hash function. The value the function returns is called a hash value, message digest, or fingerprint. The algorithm is irreversible: you cannot work out the original message from its digest, which is why it is also called a one-way hash function. How do you know whether the software you downloaded is the complete, original version, or whether a man in the middle has embedded a virus in it? This is where a hash function helps. The software provider usually publishes a hash value alongside the download address. After downloading the software, the user computes its hash value locally with the same hash algorithm and compares it with the officially published value. If the two are the same, the software is intact; otherwise, it has been modified. Common hash algorithms include MD5 and the SHA family.

For example, when you download Eclipse, the official website provides both the download address and a message digest of the file.
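
The download check described above might look like the following sketch, which uses Python's standard `hashlib` module. The byte strings stand in for a real installer file, and `sha256_hex` and `is_intact` are helper names I made up for illustration.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 message digest of `data` as a hex string."""
    return hashlib.sha256(data).hexdigest()

def is_intact(downloaded: bytes, expected_hex: str) -> bool:
    """Compare the locally computed digest with the published one."""
    return sha256_hex(downloaded) == expected_hex

# Stand-in for the file the vendor publishes, and the digest listed next to it.
original = b"pretend this is the installer published by the vendor"
published_digest = sha256_hex(original)

downloaded_ok = original                            # arrived unchanged
downloaded_bad = original + b" plus injected code"  # modified in transit

ok = is_intact(downloaded_ok, published_digest)     # digests match
bad = is_intact(downloaded_bad, published_digest)   # digests differ
```

For a large file you would normally read it in chunks and feed each chunk to the hash object's `update()` method rather than load it into memory at once.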

A hash function can guarantee data integrity and reveal whether data has been tampered with, but it cannot detect masquerading. A man in the middle can replace the data and the digest at the same time. The data then looks intact, but it is no longer the real data: what the receiver gets was sent not by the real sender but by the man in the middle. Message authentication is the solution to the problem of data authenticity. Authentication techniques include message authentication codes and digital signatures.

Message authentication code

Message Authentication Code (MAC) is a technology that can verify the integrity of a message and authenticate it (message authentication refers to confirming that the message is from the correct sender). Message authentication code can be simply understood as a one-way hash function associated with a key.

Before Alice sends a message to Bob, the two of them must share a key. Alice computes the MAC value of the message with the shared key and sends the message together with the MAC value to Bob. After Bob receives them, he computes the MAC value of the message locally with the same key and compares it with the MAC value he received. If the two match, the message is intact, and Bob can be sure that Alice sent it and that no man in the middle forged it. However, message authentication codes run into the same key distribution problem as symmetric encryption, so public-key encryption is again used to distribute the key.
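
Here is a minimal sketch of that exchange using HMAC-SHA256 from Python's standard `hmac` module. HMAC is one common way to build a MAC; the article does not say which MAC is meant, so treat that choice, and the helper names `make_mac` and `verify_mac`, as my assumptions.

```python
import hashlib
import hmac
import secrets

shared_key = secrets.token_bytes(32)   # known to both Alice and Bob

def make_mac(key: bytes, message: bytes) -> bytes:
    """Compute the MAC value: a one-way hash that also depends on the key."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify_mac(key: bytes, message: bytes, mac: bytes) -> bool:
    """Recompute the MAC locally and compare it in constant time."""
    return hmac.compare_digest(make_mac(key, message), mac)

# Alice computes the MAC and sends (message, mac) to Bob.
message = b"Transfer 10K to Bob"
mac = make_mac(shared_key, message)

# Bob verifies what he received.
ok = verify_mac(shared_key, message, mac)                      # untouched message
tampered = verify_mac(shared_key, b"Transfer 100K to Bob", mac)  # altered message

# The Attacker does not know the key, so a MAC he makes up does not verify.
attacker_mac = make_mac(secrets.token_bytes(32), message)
forged = verify_mac(shared_key, message, attacker_mac)

# Bob holds the same key, so he can produce exactly the same MAC as Alice.
bob_mac = make_mac(shared_key, message)
```

The last line is worth noticing: because `bob_mac` equals `mac`, a MAC value cannot prove to anyone else which of the two key holders produced it.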

In addition, message authentication codes have another problem they cannot solve. Although Bob can detect tampering and masquerading, Alice can deny having sent the message: “I did not send this message. I believe Bob’s key was stolen by the Attacker, so the Attacker could have sent it.” If Alice argues this way, there is really no way to refute her. So how do we stop Alice from denying it? Digital signatures can do this.

Digital signature

Suppose Alice sends an email to Bob asking to borrow 10K. The email could be tampered with (changed to 100K), it could be forged (Alice never sent it, and someone pretending to be Alice did), and it could be denied afterwards (once she has the money, Alice says, “I didn’t borrow it, I didn’t sign anything”).

Message authentication codes can solve the problems of tampering and forgery. But when Alice denies that she borrowed the money, Bob has to ask a third party to arbitrate. Even then, the arbitrator cannot determine whether Alice really borrowed the money, because Alice and Bob share the key, which means either of them can compute the correct MAC value. Bob says, “Your message and its MAC value match the MAC value I computed myself, so it must be you who sent the message.” Alice replies, “You leaked the key to someone else, and he sent the email. Go and ask him.” And so Alice denies it.

A digital signature can solve the problem of denial. Alice and Bob use different keys, and the public-key encryption algorithm is used in reverse. The sender, Alice, signs the message with her private key, and only Alice, who holds the private key, can produce that signature. Bob verifies the signature with the paired public key, and a third party can verify it with the same public key. If verification succeeds, the message must have been sent by Alice, and she cannot deny it, because only Alice can generate the signature.

The process goes like this:

Step 1: The sender, Alice, runs the message through a hash function to produce the message digest. She encrypts the digest with her private key to produce the signature, and sends the signature to the receiver, Bob, together with the message.

Step 2: The data is transmitted over the network. After Bob receives it, he separates the signature from the message.

Step 3: Bob verifies the signature. He runs the message through the same hash function to obtain a message digest, and decrypts the signature sent by Alice with her public key. If the two values are equal, the signature is valid. Otherwise, verification fails, and the message was not sent by Alice or was changed on the way.
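
The three steps can be sketched with toy textbook RSA and SHA-256. Everything specific here is an assumption made for illustration: the two primes are small Mersenne primes, the digest is simply reduced modulo `n`, and the names `digest_as_int`, `sign`, and `verify` are made up. Real signature schemes use standardized padding and much larger, randomly generated keys.

```python
import hashlib

# Toy RSA key pair for Alice: for illustration only, NOT secure.
p, q = 2**89 - 1, 2**107 - 1          # two known Mersenne primes (assumed toy values)
n = p * q
e = 65537                             # public exponent: (e, n) is Alice's public key
d = pow(e, -1, (p - 1) * (q - 1))     # private exponent: known only to Alice

def digest_as_int(message: bytes) -> int:
    """Hash the message and map the digest to a number below n (toy shortcut)."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n

def sign(message: bytes) -> int:
    """Step 1: Alice transforms the digest with her private key."""
    return pow(digest_as_int(message), d, n)

def verify(message: bytes, signature: int) -> bool:
    """Step 3: anyone holding (e, n) recovers the digest and compares."""
    return pow(signature, e, n) == digest_as_int(message)

message = b"I, Alice, borrow 10K from Bob"
signature = sign(message)             # sent to Bob together with the message

valid = verify(message, signature)                              # Alice's real message
tampered = verify(b"I, Alice, borrow 100K from Bob", signature)  # altered amount
bad_signature = verify(message, signature + 1)                   # signature was changed
```

Unlike the MAC, `verify` needs only the public values `e` and `n`, so Bob or any third party can run it, while only the holder of `d` can run `sign`.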

Public key certificate

Public-key cryptography plays an important role in digital signatures, but how do we make sure that a public key is legitimate? What if it is replaced in a man-in-the-middle attack? Public keys are therefore managed by a trusted third party called a Certification Authority (CA). The CA collects the user’s personal information, such as name, organization, and email address, together with the user’s public key, and applies its own digital signature to them to produce a public-key certificate (PKC), or certificate for short.

When Alice sends a message to Bob, she encrypts the data with Bob’s public key. However, Alice does not get the public key directly from Bob. Instead, it is entrusted to a trusted third party, Trent.

  1. Bob generates a key pair. The private key is kept by Bob, and the public key is delivered to Trent.
  2. Trent confirms that the public key really is Bob’s after a series of rigorous checks.
  3. Trent has generated his own key pair in advance. He digitally signs Bob’s public key with his own private key and produces a digital certificate. The certificate contains Bob’s public key. The public key doesn’t need to be encrypted here, because it is fine for anyone to obtain Bob’s public key, as long as they can be sure it is Bob’s.
  4. Alice obtains the certificate issued by Trent.
  5. Alice uses Trent’s public key to verify the signature on the certificate. If verification succeeds, the public key in the certificate is Bob’s.
  6. Alice can then use Bob’s public key from the certificate to encrypt the message and send it to Bob.
  7. Bob receives the ciphertext and decrypts it with the paired private key.
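
The seven steps can be sketched end to end with the same kind of toy RSA. All of the specifics are assumptions for illustration: the Mersenne-prime key pairs, the made-up `subject=...;e=...;n=...` certificate layout, and the helper names. Real certificates follow the X.509 format and are handled by a cryptographic library.

```python
import hashlib

E = 65537  # public exponent shared by all toy key pairs below

def make_keypair(p: int, q: int):
    """Return (public_key, private_key) as (exponent, modulus) tuples. Toy RSA, NOT secure."""
    n = p * q
    d = pow(E, -1, (p - 1) * (q - 1))
    return (E, n), (d, n)

def digest_as_int(data: bytes, n: int) -> int:
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n

def sign(private_key, data: bytes) -> int:
    d, n = private_key
    return pow(digest_as_int(data, n), d, n)

def verify(public_key, data: bytes, signature: int) -> bool:
    e, n = public_key
    return pow(signature, e, n) == digest_as_int(data, n)

def certificate_body(subject: str, public_key) -> bytes:
    """Hypothetical certificate layout: the fields Trent signs."""
    e, n = public_key
    return f"subject={subject};e={e};n={n}".encode()

def check_certificate(cert: dict, ca_public_key) -> bool:
    body = certificate_body(cert["subject"], cert["public_key"])
    return verify(ca_public_key, body, cert["signature"])

# Trent's key pair exists in advance; Bob generates his own (step 1).
trent_public, trent_private = make_keypair(2**89 - 1, 2**107 - 1)
bob_public, bob_private = make_keypair(2**61 - 1, 2**127 - 1)

# Steps 2-3: after checking Bob's identity, Trent signs Bob's public key.
certificate = {
    "subject": "Bob",
    "public_key": bob_public,
    "signature": sign(trent_private, certificate_body("Bob", bob_public)),
}

# Steps 4-5: Alice verifies the certificate with Trent's public key.
trusted = check_certificate(certificate, trent_public)

# If the Attacker swaps in his own public key, Trent's signature no longer matches.
attacker_public, _ = make_keypair(2**61 - 1, 2**89 - 1)
forged_ok = check_certificate(dict(certificate, public_key=attacker_public), trent_public)

# Steps 6-7: Alice encrypts with the certified key, Bob decrypts with his private key.
message = 42
ciphertext = pow(message, *certificate["public_key"])
recovered = pow(ciphertext, *bob_private)
```

The sketch leaves one question open on purpose: Alice still has to obtain `trent_public` in a way she trusts.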

At this point, we have a reasonably complete scheme for secure data transmission. HTTPS (SSL/TLS) is built on top of this process.

This article comes from “A Programmer’s Micro Site” (id: VTtalk), a blog about Python.