Plenty of practical content and more stories: search WeChat for “Programming refers to North” to follow this slightly different programmer ~

Preface

It was an ordinary weekend in 2020. Xiao Bei was at home browsing Bilibili and watching videos from his favorite uploader.

My girlfriend, who was playing on her cell phone, suddenly asked, “Do you know what digital certificates are for? Why does the browser say certificates are not trusted?”

This happens to be a topic I can get fired up about, so I talked all the way from encryption through digital signatures to digital certificates… In the end I had talked my girlfriend to sleep, and I wrote this article alone.

Main text

If you can clearly answer the following questions, you can scroll straight to the bottom, give me a thumbs up, and go spend the time with your girlfriend instead.

  1. In asymmetric encryption, both the public key and the private key can seemingly be used to encrypt, so when do you encrypt with the public key, and when do you “encrypt” with the private key?

  2. What is a digital signature and what does a digital signature do?

  3. Why sign a digest of the data instead of computing the digital signature of the original data directly?

  4. What are digital certificates and what problems do they solve?

This article mainly focuses on the principles of digital signature and digital certificate and their functions.

I want readers with no background in cryptography to be able to follow along, so let me first align on a few cryptography concepts.

1. What is encryption

Encryption processes plaintext data with a specific algorithm to turn it into an unreadable piece of data, commonly called “ciphertext”. The ciphertext can be decrypted with the “key” to restore the original plaintext. In this way, the data is protected from being stolen and read by unauthorized persons.

Easy definition, right? Let’s take a look and consider which of the following are encryption methods:

  • AES
  • RSA
  • MD5
  • BASE64
  • SM4

These are commonly used data encoding technologies in daily development, but only AES, RSA and SM4 can be regarded as encryption methods.

Why is that? An easy way to tell the difference is to check whether the encoded data can be restored with a key: if it can, it is encryption.

MD5 is effectively a lossy compression of the data. No matter how long the input is, 1 KB, 1 MB or 1 GB, a fixed 128-bit hash value is generated. Moreover, it is theoretically impossible to restore the original data from an MD5 value; in other words, MD5 is irreversible.

Because it is irreversible and deterministic (the same data always produces the same value), MD5 has been widely used in file integrity verification, password storage and digital signatures.

Whether BASE64 counts as an encryption method is sometimes debated. BASE64 encoding does not require a key, and the encoded string can be decoded by anyone, so it is generally not considered an encryption method. BASE64 is commonly used for transcoding, converting a sequence of binary bytes into a sequence of ASCII characters.
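The difference between an encoding and a hash can be checked in a few lines. This is only a small sketch using Python's standard `base64` and `hashlib` modules; the sample message is made up for illustration.

```python
import base64
import hashlib

data = b"Meet me at 8pm"

# BASE64 is an encoding: no key is involved and anyone can reverse it.
encoded = base64.b64encode(data)
decoded = base64.b64decode(encoded)

# MD5 is a hash: the output is always 128 bits (16 bytes),
# no matter how large the input is, and there is no "decode" step.
short_digest = hashlib.md5(data).digest()
long_digest = hashlib.md5(data * 100_000).digest()

print(decoded == data)                      # True
print(len(short_digest), len(long_digest))  # 16 16
```

Hashing the same input again always yields the same 16 bytes, which is what makes a hash usable as a fingerprint of the data.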

2. Classification of encryption algorithms

Encryption algorithms can be classified into the following types based on whether the keys used for encryption and decryption are the same:

  • Symmetric Cryptography
  • Asymmetric Cryptography

2.1 Symmetric encryption

Symmetric encryption means that encryption and decryption use the same key.

2.2 Asymmetric encryption

Asymmetric encryption refers to encryption and decryption using different keys, which are called public key and private key respectively.

Public keys are available to all, while private keys need to be kept secret.

Data encrypted with a public key can only be decrypted with a private key:

Similarly, data “encrypted” by the private key can only be “decrypted” by the public key:

Did you notice that I put the private key “encrypt” in quotes? Why?

Because the private key is not really used for encryption; the accurate expression is “sign with the private key, verify with the public key”.

Many people have the misconception that both the public key and the private key can be used for encryption.

That is not the case. I will explain why when we get to signatures.

3. The story begins

To tell the story, Xiao Bei has invited Alice and Bob, the couple who appear in almost every cryptography text, and Eve, the representative eavesdropper.

Let’s start with the story of Alice and Bob dating, and talk about the hidden dangers and how they are resolved.

3.1 First Round

It was a dark and windy night in September. Bob wanted to take Alice out on a date, so he sent her an email:

But we all know that networks are untrustworthy, and because messages are transmitted in clear text on networks, hackers can easily intercept, tamper with, and even impersonate Bob.

Here’s how Eve the hacker does it:

See, Eve easily gets hold of the email (eavesdropping), modifies the email (tampering), and can even impersonate Bob and send Alice emails at any time (masquerading).

If the content forged by Eve in the above picture is received by Alice, the consequences can be imagined.

In the real world, we use the Internet every day to chat, transfer money and browse “non-existent” websites.

If all the data is transmitted in plaintext like this, it is obviously insecure.

3.2 Second Round

Since we can’t transmit in plaintext, Bob and Alice agree on a key in advance and use symmetric encryption to encrypt the email content.

Now Bob’s emails are encrypted with the key he and Alice agreed on in advance before they are transmitted.

Without the key, Eve cannot read the email’s contents, and she cannot tamper with it or impersonate Bob in any meaningful way.

That is because tampered or forged data would have to be encrypted with the key again before Alice could decrypt it into something sensible.

So as long as Bob and Alice can ensure that the key is not leaked, the whole communication is secure.
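To make “the same key encrypts and decrypts” concrete, here is a toy sketch. It is not AES or SM4: it simply XORs the data with a keystream derived from the shared key using SHA-256, and the helper names (`keystream`, `xor_cipher`) and sample keys are made up for illustration. It should not be used to protect real data.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Stretch the key into `length` pseudo-random bytes (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def xor_cipher(key: bytes, data: bytes) -> bytes:
    """XOR the data with the keystream; applying it twice restores the input."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

shared_key = b"agreed-in-advance"       # known only to Bob and Alice
message = b"Meet me at 8pm"

ciphertext = xor_cipher(shared_key, message)      # Bob encrypts
plaintext = xor_cipher(shared_key, ciphertext)    # Alice decrypts with the same key
eve_guess = xor_cipher(b"eve-guess", ciphertext)  # Eve tries a wrong key

print(plaintext == message)   # True
print(eve_guess == message)   # False
```

Everything depends on `shared_key` staying secret, which is exactly the weakness the story turns to next.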

If the key is leaked or intercepted by a man in the middle, the communication is as good as plaintext.

So we can’t stake our security on trusting people to keep the key safe.

And there’s a problem: if two people can’t meet offline, how can they exchange keys safely online?

This seems to be unsolvable, because when exchanging keys we must communicate in clear text, otherwise the other party will not be able to understand it at all. But cleartext exchange means possible disclosure.

But don’t forget that there’s another good thing in our cryptography toolbox — asymmetric encryption.

Bob and Alice each generate a pair of public and private keys, and since the public key is inherently public, that is, accessible to anyone, it can be exchanged in plaintext over the network.

The email is encrypted using the public key and sent to the recipient. The recipient can decrypt the email using his/her private key. Perfect ~

3.3 Third Round

Let’s see how Bob can send a message to Alice in an asymmetric encryption system.

First of all, Alice generates a pair of public and private keys. The private key must be known only to Alice, while the public key can be known to anyone. Therefore, the public key can be sent directly to Bob, and it does not matter if it is intercepted.

Bob uses Alice’s public key to encrypt the email, and the encrypted content can only be decrypted by Alice’s private key, so it is useless for Eve to intercept it.

Conversely, if Alice wants to reply to Bob, she needs to encrypt the reply with Bob’s public key before sending it.

This solves the key exchange problem and ensures that the contents of the email are not leaked. Which means we can now protect against eavesdropping.
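The flow above can be sketched with “textbook” RSA in plain Python. This is a toy under loud assumptions: the key is built from two well-known Mersenne primes, there is no padding, and the helper names (`make_keypair`, `encrypt`, `decrypt`) are invented for this example. Real systems use a vetted library with proper key generation and padding.

```python
def make_keypair(p: int, q: int, e: int = 65537):
    """Build a textbook RSA key pair from two primes (toy only, no padding)."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))   # private exponent (needs Python 3.8+)
    return (n, e), (n, d)               # (public key, private key)

# Alice's toy key pair; she publishes alice_public and keeps alice_private.
alice_public, alice_private = make_keypair(2**127 - 1, 2**107 - 1)

def encrypt(public_key, message: bytes) -> int:
    n, e = public_key
    m = int.from_bytes(message, "big")
    assert m < n, "message too long for this toy key"
    return pow(m, e, n)

def decrypt(private_key, ciphertext: int) -> bytes:
    n, d = private_key
    m = pow(ciphertext, d, n)
    return m.to_bytes((m.bit_length() + 7) // 8, "big")

ciphertext = encrypt(alice_public, b"Meet me at 8pm")  # Bob needs only the public key
print(decrypt(alice_private, ciphertext))              # b'Meet me at 8pm'
```

Eve can see `alice_public` and `ciphertext`, but without the private exponent she has no practical way to undo the operation.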

3.4 How can I prove that Bob is Bob

I don’t know if you’ve noticed, but there’s another problem:

Eve can also use Alice’s public key to send emails to Alice as Bob, because Alice’s public key is public and anyone can access it.

Since Eve also has access to Alice’s public key, there is no way to prevent Eve from forgery and tampering, and for Alice, she cannot tell whether the email is from Eve or Bob.

So the question is essentially “how does Alice confirm that the email is from Bob?”

So how do we do this in life?

We have Bob sign his name on a piece of paper and add his fingerprint, because fingerprints and handwriting are Bob’s own and difficult for others to forge.

So we need to introduce a similar mechanism in computers:

A unique token that only Bob can generate, and that anyone else can verify as really belonging to Bob.

This is the topic we are going to talk about today — “digital signature”.

Remember what was unique to Bob?

Yes, it is Bob’s own private key. Bob computes a “signature” over the email content with his private key, and sends the “signature” together with the email content. The recipient, Alice, can check whether the signature is correct by using Bob’s public key, which is called “signature verification”.

If the signature was not computed with Bob’s private key, verification with Bob’s public key will fail.

As you can see, Eve tries to use her own private key to compute the signature and send it to Alice, but when Alice verifies the signature with Bob’s public key, verification fails!

So could Eve tamper with the content and fake Bob’s signature? That’s impossible! When the content changes, the corresponding signature must be recomputed, and generating the signature requires the private key. As long as Bob’s private key is not leaked, the signature cannot be forged.

What’s that? What happens if the private key gets leaked? Then forget everything I said……

So using digital signatures, we can identify the sender of the message, which means that hackers can’t disguise the sender and send data, and they can’t tamper with it.
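Here is a minimal sketch of “sign with the private key, verify with the public key”, again using toy textbook RSA. The keys are built from known Mersenne primes, the message is signed directly (which only works because it is short), and `make_keypair`, `sign` and `verify` are names invented for this illustration.

```python
def make_keypair(p: int, q: int, e: int = 65537):
    """Toy textbook RSA key pair: returns (public key, private key)."""
    n = p * q
    return (n, e), (n, pow(e, -1, (p - 1) * (q - 1)))

bob_public, bob_private = make_keypair(2**127 - 1, 2**107 - 1)
eve_public, eve_private = make_keypair(2**89 - 1, 2**61 - 1)

def sign(private_key, message: bytes) -> int:
    n, d = private_key
    m = int.from_bytes(message, "big")
    assert m < n, "message too long for this toy key"
    return pow(m, d, n)                       # only the key owner can compute this

def verify(public_key, message: bytes, signature: int) -> bool:
    n, e = public_key
    return pow(signature, e, n) == int.from_bytes(message, "big")

message = b"Meet me at 8pm"
signature = sign(bob_private, message)

print(verify(bob_public, message, signature))            # True: really from Bob
print(verify(bob_public, b"Meet me at 9pm", signature))  # False: content was changed

forged = b"Let's break up"
eve_signature = sign(eve_private, forged)                # Eve signs with her own key
print(verify(bob_public, forged, eve_signature))         # False: not Bob's key
```

Note that `message` itself travels in the clear here; the signature only proves who sent it and that it was not modified.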

Note:

It can be seen that the data here is transmitted in plaintext, so there is a risk of eavesdropping. But we have deliberately omitted the confidentiality mechanism to explain how the digital signature mechanism works.

In order to ensure the confidentiality of data, it is common practice that the communication parties securely exchange symmetric encryption keys through asymmetric encryption, and symmetric encryption is used to ensure the confidentiality of data in subsequent communication.

Also, the purpose of a “signature” is not to keep data confidential, but to verify the source of the data and detect tampering, that is, to confirm the identity of the sender.
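The common practice mentioned in this note (exchange a symmetric key with asymmetric encryption, then encrypt the data symmetrically) might look like the following sketch. It combines toy textbook RSA with a toy XOR cipher; both are stand-ins chosen so the example runs with the standard library only, and all names are hypothetical.

```python
import hashlib
import secrets

def make_keypair(p: int, q: int, e: int = 65537):
    """Toy textbook RSA key pair: returns (public key, private key)."""
    n = p * q
    return (n, e), (n, pow(e, -1, (p - 1) * (q - 1)))

def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR with a SHA-256-derived keystream."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

alice_public, alice_private = make_keypair(2**127 - 1, 2**107 - 1)
email = b"A long email body ... " * 1000

# Bob: pick a fresh symmetric key, wrap it with Alice's public key,
# and encrypt the (possibly large) email with the symmetric key.
session_key = secrets.token_bytes(16)
n, e = alice_public
wrapped_key = pow(int.from_bytes(session_key, "big"), e, n)
ciphertext = xor_cipher(session_key, email)

# Alice: unwrap the symmetric key with her private key, then decrypt the email.
n, d = alice_private
recovered_key = pow(wrapped_key, d, n).to_bytes(16, "big")
print(xor_cipher(recovered_key, ciphertext) == email)   # True
```

The slow asymmetric operation touches only the 16-byte key, while the fast symmetric cipher handles the bulk of the data.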

In general, we don’t calculate digital signatures directly on the data itself. Why?

Because digital signatures are built on asymmetric cryptography, which relies on complex mathematical operations such as multiplication and modular arithmetic on large numbers, and these take a long time.

Computing a digital signature over a large piece of data is therefore slow. So we hash the original data first and sign the hash value, which is called the “digest”.

A digest is like a fingerprint: it can stand in for the whole message, and as soon as the content changes, the computed digest changes too.

The digest should be irreversible. This article uses MD5 as the example hash function, whose output is a fixed 128 bits; since practical collision attacks on MD5 are known, newer systems generally use a stronger hash such as SHA-256.

Why should the digest be irreversible?

Since Alice can use Bob’s public key to “unlock” the signature, in theory someone like Eve can also use Bob’s public key to unlock the signature and get whatever was signed.

Therefore, it is better to sign the digest of the data, so that even if Eve unlocks the signature, all she gets is the digest. If the digest is irreversible, that is, the original text cannot be deduced from it, the signature itself reveals nothing about the content.

The sender uses the private key to compute the digital signature over the digest. So how does the recipient verify it?

After receiving the email, the recipient Alice extracts the digital signature and “decrypts” it with Bob’s public key to obtain “Digest 1”. If this step succeeds, the signature was indeed produced by Bob.

(Aside: if verifying the signature with Bob’s public key fails, the signature cannot have been generated by Bob’s private key.)

She then uses the same hash function to compute “Digest 2” from the email content and compares it with “Digest 1” obtained above. If the two match, the content has not been tampered with.

This is a two-step process to verify the identity of the sender and ensure that the data has not been tampered with.
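The two-step check can be sketched as follows. Assumptions to note: the key is toy textbook RSA built from two known Mersenne primes, the hash is SHA-256 rather than the MD5 used in the text, and `make_keypair`, `digest_int`, `sign` and `verify` are hypothetical helper names.

```python
import hashlib

def make_keypair(p: int, q: int, e: int = 65537):
    """Toy textbook RSA key pair: returns (public key, private key)."""
    n = p * q
    return (n, e), (n, pow(e, -1, (p - 1) * (q - 1)))

bob_public, bob_private = make_keypair(2**521 - 1, 2**607 - 1)

def digest_int(message: bytes) -> int:
    """Hash the message and read the 32-byte digest as an integer."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big")

def sign(private_key, message: bytes) -> int:
    n, d = private_key
    return pow(digest_int(message), d, n)   # sign the digest, not the message

def verify(public_key, message: bytes, signature: int) -> bool:
    n, e = public_key
    digest_1 = pow(signature, e, n)         # step 1: recover the digest Bob signed
    digest_2 = digest_int(message)          # step 2: hash the content that arrived
    return digest_1 == digest_2

email = b"Meet me at 8pm. " * 100_000       # about 1.6 MB of content
signature = sign(bob_private, email)        # the RSA step still handles only 32 bytes

print(verify(bob_public, email, signature))            # True
print(verify(bob_public, email + b"!", signature))     # False: content was changed
```

However large `email` grows, the expensive private-key operation always works on a fixed-size digest.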

3.5 Is that enough?

Bob and Alice can now rely on symmetric encryption to communicate confidentially, as well as digital signatures to verify that messages were sent by each other.

But this all rests on the assumption that the public key Alice holds really is Bob’s, and vice versa.

What does that mean?

Just think: if Eve sends her own public key to Alice while posing as Bob, and Alice saves it, then every message Bob sends afterwards will fail signature verification and be treated as coming from an impostor. So how, you might ask, could Eve send her public key to Alice without Alice noticing?

Look! We’re back where we started, only this time it’s the public key that’s been tampered with, before it was the message itself.

Because Bob’s public key is sent to Alice directly over the network, Eve can interfere at this step and send her own public key to Alice while posing as Bob. In other words, the step of sending the public key is not:

  • tamper-proof
  • impersonation-proof

How do we achieve tamper-proofing and impersonation-proofing?

As we discussed earlier, with a digital signature! However, a digital signature can be verified only when the receiver already holds the sender’s public key.

And distributing the public key is exactly the problem we are trying to solve, so……we are deadlocked. It’s a chicken-and-egg situation.

The problem is Bob can’t prove he’s Bob.

Doesn’t this feel familiar? When handling paperwork, I have often been asked to provide proof of the “my mother is my mother” kind. When we simply say “my mother is my mother”, nobody believes us; we need a credible third party, such as the police station, to issue the proof.

So “how can Alice be sure that the public key Bob sends her is indeed Bob’s and has not been tampered with?”

With only Alice and Bob involved, it can’t be verified.

Bob’s public key is Bob’s public key

3.6 Digital Certificate

To solve this problem, the “digital certificate” was introduced. So what is a digital certificate?

Baidu Baike:

Digital certificate refers to a digital authentication that marks the identity information of each party in Internet communication. People can use it to identify each other on the Internet.

Therefore, digital certificates are also called digital identifiers. Digital certificates encrypt or decrypt the information and data that network users communicate to ensure the integrity and security of information and data.

After reading this description, do you still feel lost in the fog? Let me put it in plain words ~

As long as you understand digital signatures, you can understand digital certificates here, because I call digital certificates “digital signatures for public keys.”

Why is that? The purpose of introducing digital certificates is to ensure that the public key cannot be tampered with unnoticed: even if it is tampered with, the change can be detected.

The way to detect tampering is a digital signature, but we cannot produce this signature ourselves: our public key has not been distributed yet, so nobody could verify it.

Therefore, we have to find a trusted third party to sign for us, namely a Certificate Authority (CA). The CA uses its private key to sign the certificate information: the issuing authority, the validity period, the public key and the owner (Subject).

This information, bundled together with the resulting signature, is called a “digital certificate.”

So Bob can go to CA and apply for a certificate, and then send Alice his certificate. How does Alice verify that the certificate is Bob’s certificate?

Of course, the CA’s public key is used for verification.

Note:

The CA’s public key also needs to be distributed using certificates, so Alice’s computer must install the CA certificate, which contains the CA’s public key.

After receiving the digital certificate sent by Bob, Alice uses the CA’s public key to verify. If the verification passes, it proves that this is indeed Bob’s certificate, and Bob’s public key contained in the certificate can be used to communicate according to the process discussed above.

Can Eve tamper with Bob’s certificate in mid-stream?

The answer is no, because the certificate information is signed with the CA’s private key, and if Eve changes even a single bit, the final signature verification will fail.

Can Eve change the certificate information and recompute the certificate’s digital signature herself?

No again, because computing the certificate’s digital signature requires the CA’s private key, which Eve cannot access.
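A certificate as “a digital signature for a public key” can be sketched like this. It is deliberately not the real X.509 format: the certificate is just a Python dict serialized with JSON, the keys are toy textbook RSA keys, and every name and field value (`Toy CA`, the validity date, `issue_certificate`, `verify_certificate`) is made up for illustration.

```python
import hashlib
import json

def make_keypair(p: int, q: int, e: int = 65537):
    """Toy textbook RSA key pair: returns (public key, private key)."""
    n = p * q
    return (n, e), (n, pow(e, -1, (p - 1) * (q - 1)))

def digest_int(info: dict) -> int:
    body = json.dumps(info, sort_keys=True).encode()
    return int.from_bytes(hashlib.sha256(body).digest(), "big")

ca_public, ca_private = make_keypair(2**521 - 1, 2**607 - 1)
bob_public, _bob_private = make_keypair(2**127 - 1, 2**107 - 1)
eve_public, eve_private = make_keypair(2**89 - 1, 2**61 - 1)

def issue_certificate(info: dict, signer_private) -> dict:
    """Sign the certificate information and bundle it with the signature."""
    n, d = signer_private
    return {"info": info, "signature": pow(digest_int(info), d, n)}

def verify_certificate(cert: dict, ca_public_key) -> bool:
    n, e = ca_public_key
    return pow(cert["signature"], e, n) == digest_int(cert["info"])

bob_info = {"issuer": "Toy CA", "subject": "Bob",
            "valid_until": "2021-12-31", "public_key": list(bob_public)}
bob_cert = issue_certificate(bob_info, ca_private)
print(verify_certificate(bob_cert, ca_public))       # True

# Eve swaps in her own public key but keeps the CA's signature.
tampered = {"info": {**bob_info, "public_key": list(eve_public)},
            "signature": bob_cert["signature"]}
print(verify_certificate(tampered, ca_public))       # False

# Eve re-signs the modified information with her own private key.
resigned = issue_certificate(tampered["info"], eve_private)
print(verify_certificate(resigned, ca_public))       # False
```

Alice only needs `ca_public` in advance; Bob's public key arrives inside the certificate and is trusted because the CA's signature over it verifies.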

And if she could? Then nothing in the whole world could be trusted.

3.7 What is a Digital Certificate like

Here is my own certificate on my computer:

As you can see, it contains the certificate holder’s public key and the certificate’s signature.

In addition, CAs are organized in a hierarchy. The certificate of a lower-level CA must be signed by the CA above it.

In other words, there must be a root certificate authority at the top, so who signs its certificate?

The answer is that it signs its own certificate, vouching for itself.
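Walking such a hierarchy can be sketched as a loop over a chain of certificates, ending at a self-signed root that must already be in the local trust store. As before, this uses toy textbook RSA and a made-up dict format instead of real X.509, and names such as `issue`, `signed_by`, `verify_chain` and `Toy Root CA` are hypothetical.

```python
import hashlib
import json

def make_keypair(p: int, q: int, e: int = 65537):
    """Toy textbook RSA key pair: returns (public key, private key)."""
    n = p * q
    return (n, e), (n, pow(e, -1, (p - 1) * (q - 1)))

def digest_int(info: dict) -> int:
    body = json.dumps(info, sort_keys=True).encode()
    return int.from_bytes(hashlib.sha256(body).digest(), "big")

FIELDS = ("subject", "issuer", "public_key")

def issue(subject, subject_public, issuer, issuer_private) -> dict:
    info = {"subject": subject, "issuer": issuer, "public_key": list(subject_public)}
    n, d = issuer_private
    return {**info, "signature": pow(digest_int(info) % n, d, n)}

def signed_by(cert: dict, issuer_public) -> bool:
    n, e = issuer_public
    info = {field: cert[field] for field in FIELDS}
    return pow(cert["signature"], e, n) == digest_int(info) % n

def verify_chain(chain, trusted_roots) -> bool:
    """`chain` is ordered leaf first, root last."""
    for cert, issuer_cert in zip(chain, chain[1:]):
        if cert["issuer"] != issuer_cert["subject"]:
            return False
        if not signed_by(cert, issuer_cert["public_key"]):
            return False
    root = chain[-1]
    # The root vouches for itself, so it must match a pre-installed entry.
    return (trusted_roots.get(root["subject"]) == root["public_key"]
            and signed_by(root, root["public_key"]))

root_public, root_private = make_keypair(2**521 - 1, 2**607 - 1)
mid_public, mid_private = make_keypair(2**127 - 1, 2**107 - 1)
bob_public, _ = make_keypair(2**89 - 1, 2**61 - 1)

root_cert = issue("Toy Root CA", root_public, "Toy Root CA", root_private)  # self-signed
mid_cert = issue("Toy Intermediate CA", mid_public, "Toy Root CA", root_private)
bob_cert = issue("Bob", bob_public, "Toy Intermediate CA", mid_private)

trusted_roots = {"Toy Root CA": list(root_public)}   # pre-installed trust store
print(verify_chain([bob_cert, mid_cert, root_cert], trusted_roots))   # True
print(verify_chain([bob_cert, mid_cert, root_cert], {}))              # False
```

The second call fails even though every signature is valid, because a self-signed root proves nothing unless the verifier already trusts it.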

This is a self-signed root CA certificate on my computer:

Why can root certificates be self-signed? Who guarantees security?

Do you worry when you put your money in the bank? We trust banks on the basis of our trust in our country, and that is the foundation of the chain of trust! Problems have to be handled in layers: if you refuse to accept any common foundation and keep nesting one doll inside another, the problem can never be solved.

Then there is the question of how to ensure the reliability of the root certificates. They come pre-installed with the operating system and the browser, and they are selected by vendors such as Microsoft and Apple.

3.8 Is the Certificate Not Trusted?

So when does the browser say “certificate not trusted”?

Based on our analysis above, here are the possible reasons:

  1. The certificate is not issued by an authoritative CA

To save money, some enterprises use unofficial certificates that have not been issued by a recognized CA. Such a certificate cannot be verified with the CA public keys built into the browser.

  2. Certificate expired

As mentioned above, one of the fields of the certificate is the validity period, usually one or two years. If the certificate has expired, the browser will say “certificate cannot be trusted.”

  3. Certificate deployment error

The certificate may be deployed incorrectly on the server, for example a certificate that does not match the domain name, since the certificate contains a field with the owner information.
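The same three checks can be observed outside the browser. The sketch below uses Python's standard `ssl` module, whose default context requires a certificate chain that leads to a trusted root and a certificate that matches the hostname. `check_site` is a hypothetical helper written for this example, and calling it requires network access.

```python
import socket
import ssl

# The default context loads the system's trusted root certificates,
# requires a valid chain, and checks that the certificate matches the hostname.
context = ssl.create_default_context()
print(context.verify_mode == ssl.CERT_REQUIRED)   # True
print(context.check_hostname)                     # True

def check_site(hostname: str, port: int = 443) -> str:
    """Try a TLS handshake and report whether the certificate is trusted."""
    try:
        with socket.create_connection((hostname, port), timeout=5) as sock:
            with context.wrap_socket(sock, server_hostname=hostname):
                return "trusted"
    except ssl.SSLCertVerificationError as err:
        # verify_message describes the reason, for example an expired
        # certificate, an unknown issuer, or a hostname mismatch.
        return f"not trusted: {err.verify_message}"
```

Calling `check_site` with a hostname of your choice returns either `"trusted"` or the verification failure reported by the TLS library.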

Well, now it’s finally safe for Bob to send Alice his date invitation ~

Appendix

QA

Now let’s answer some of the questions posed at the beginning of this article:

  1. In asymmetric encryption, both the public key and the private key can seemingly be used to encrypt, so when do you encrypt with the public key, and when do you “encrypt” with the private key?

    • In the encryption scenario, I want to be the only one who can decrypt, while anyone may encrypt. That means public key encryption and private key decryption.

    • In the signature scenario, I want to be the only one who can sign, while anyone may verify. That means private key signing and public key verification.

  2. What is a digital signature and what does a digital signature do?

    • A digital signature is a digest of the data signed with a private key and sent along with the data.

    • It protects against tampering, impersonation and repudiation.

  3. Why sign a digest of the data instead of computing the digital signature of the original data directly?

    • The data may be large, and signing uses an asymmetric algorithm, which is time-consuming.
    • It prevents a third party from using the public key to unlock the signature and obtain the original data.
  4. What is a digital certificate, and what problem does the existence of a digital certificate solve?

    • A digital certificate is the certificate applicant’s public key and identity information, signed by the CA with its private key.

    • Digital certificates solve the problem of how to distribute public keys safely and lay the foundation of the chain of trust.

omg

If you found this article useful, you can help Xiao Bei by clicking “follow” and “like”.

Your support is my biggest motivation to keep writing!

I am Xiao Bei. The universe is uncertain, and you and I are both dark horses. See you next time!
