Why write this article

Recently, it was revealed that GitHub had been recording passwords in plain text in its internal logs.

Although the impact is said to be small, network and data security are once again on the table. Most users reuse only a handful of passwords, so once attackers obtain them they can try the same credentials on other websites (credential stuffing), which may cause users financial loss.

This article mainly introduces how to encrypt user passwords in transit and in storage, and explains the underlying principles.

Encryption

There are two main ways of encryption: symmetric encryption and asymmetric encryption.

Symmetric encryption

Symmetric encryption: The same secret key is used for encryption and decryption.

The symmetric encryption mode is:

  • Party A chooses an encryption rule and encrypts the information
  • Party B uses the same rule to decrypt the information

Suppose the communication between client and server uses symmetric encryption. If a single secret key is used for everything, it is easy to compromise. If a different secret key is used each time, the cost of managing and transmitting a massive number of keys becomes relatively high.
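As a rough illustration of "the same key encrypts and decrypts", here is a toy sketch in Python. The cipher (XOR with a keystream derived from SHA-256) and the names `keystream` and `xor_cipher` are assumptions made for this example, not something the article prescribes, and the construction is not suitable for production use.

```python
import hashlib
import secrets


def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key (SHA-256 in counter mode)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR with the keystream is its own inverse."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


shared_key = secrets.token_bytes(32)             # both parties must hold this key
ciphertext = xor_cipher(shared_key, b"hello")    # Party A encrypts
plaintext = xor_cipher(shared_key, ciphertext)   # Party B decrypts with the same key
```

The weak point is visible in the last three lines: both sides need `shared_key`, so the key itself has to be delivered safely somehow.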

Asymmetric encryption

Asymmetric encryption: Two keys are required for encryption and decryption, the public key and the private key.

The mode of asymmetric encryption is:

  • Party B generates two keys (public key and private key). The public key is public and accessible to anyone, while the private key is private
  • Party A obtains Party B’s public key and uses it to encrypt the information
  • Party B obtains the encrypted information and decrypts it with the private key.

Even if an attacker obtains the public key, the message cannot be decrypted without the private key. Leaving rainbow tables aside, the same key pair can be used for a long time.

RSA

The classic asymmetric encryption algorithm is RSA.

The RSA public-key encryption algorithm was proposed in 1977 by Ron Rivest, Adi Shamir, and Leonard Adleman. Public and private keys come in pairs: data encrypted with one can only be decrypted with the other. Usually the public key encrypts and the private key decrypts.

Why is RSA capable of asymmetric encryption?

Co-prime: Two positive integers are said to be co-prime if they have no common factor other than 1.

In simple terms, RSA relies on the fact that the product of two large prime numbers is extremely difficult to factor once it is large enough (at the time of writing, the longest RSA key known to have been cracked was 768 bits, whereas keys in normal use are at least 1024 bits, and 2048 bits or more is the usual recommendation today).

From the two primes, one derived value together with the product forms the public key, and another derived value together with the product forms the private key. The public key is used to encrypt and the private key is used to decrypt. For the full mathematical derivation and proof, refer to "The Principle of the RSA Algorithm".
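To make "one derived value plus the product" concrete, here is a textbook-sized sketch in Python. The primes 61 and 53 and the message 65 are illustrative assumptions; real keys use primes hundreds of digits long together with padding schemes that this sketch omits.

```python
from math import gcd

p, q = 61, 53                 # two small primes, for illustration only
n = p * q                     # 3233, the product shared by both keys
phi = (p - 1) * (q - 1)       # 3120
e = 17                        # public exponent, must be co-prime with phi
assert gcd(e, phi) == 1
d = pow(e, -1, phi)           # private exponent: modular inverse of e (Python 3.8+), 2753

public_key = (n, e)           # published to everyone
private_key = (n, d)          # kept secret by Party B

message = 65                          # must be an integer smaller than n
ciphertext = pow(message, e, n)       # Party A encrypts with the public key, 2790
recovered = pow(ciphertext, d, n)     # Party B decrypts with the private key, 65
```

Anyone who could factor `n` back into `p` and `q` could recompute `d`, which is why the size of the product matters.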

Logging in to GitHub

With that said, let's take a look at how GitHub transmits passwords.

Inspecting the login request shows that the password is transmitted in plain text...

So, is this transmission safe? It is acceptable because the request goes over HTTPS, but it is not secure enough.

HTTP and HTTPS

In a conventional HTTP request, all information is transmitted in plain text, so there are three major risks as soon as a man in the middle hijacks any stage of the link:

  • Eavesdropping risk: Third parties can learn the content of communications.
  • Tampering risk: Third parties may modify the content of communications.
  • Impersonation risk: Third parties may impersonate others to participate in communications.

How to solve these problems? Using HTTPS.

HTTPS can be thought of as HTTP + TLS. TLS is a transport layer encryption protocol and the successor to SSL. Unless otherwise specified, this article uses SSL and TLS to refer to the same protocol.

The SSL/TLS protocol is designed to address the three major risks of HTTP and is intended to:

  • Content encryption. All information is transmitted encrypted, so third parties cannot eavesdrop.
  • Identity authentication. The server presents an identity certificate, which prevents impersonation. Even if DNS hijacking redirects the user to a third-party site, the user is warned that the connection may have been hijacked.
  • Data integrity. Content cannot be tampered with by third parties unnoticed. Thanks to the verification mechanism, the communicating parties detect any tampering immediately.

With all that said, what does HTTPS actually do? The following steps walk through what happens during an HTTPS request:

  1. The client initiates an HTTPS request

  2. Server configuration

The server needs a certificate, which is generally applied for from a certificate authority. You can also issue your own (this comes up again under man-in-the-middle attacks below). The difference is that a self-issued certificate must be accepted by the client before access can continue, whereas a certificate from a trusted authority produces no prompt. The certificate comes with an RSA key pair: public key A and private key B.

  3. Send the certificate

The main content of the certificate is public key A. The certificate also contains other information, such as the certificate authority, expiration time, and so on.

  4. The client parses the certificate

The client's TLS layer verifies that the certificate carrying public key A is valid, for example by checking the issuing authority and the expiration time. If a problem is found, a warning box is displayed indicating that the certificate has a problem. If the certificate is fine, the client generates a random value. It then enters the asymmetric encryption process and encrypts the random value with public key A from the certificate.

  5. Transmit encrypted information

What is transmitted here is the random value encrypted with public key A. The purpose is to let the server obtain this random value, so that all subsequent data can be encrypted and decrypted symmetrically with it (the random value is referred to as private key C).

  6. The server decrypts the information

After decrypting with private key B, the server obtains private key C sent by the client. This ends the RSA asymmetric encryption process.

  7. The encrypted information is transmitted

The server encrypts the information using private key C.

  8. The client decrypts information

The client uses the previously generated private key C to decrypt the message from the server and obtains the decrypted content. A third party listening in on the data can do nothing with it.
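The exchange described above can be sketched end to end in Python. This is a toy model under stated assumptions, not how a real TLS library works: the RSA modulus is built from two Mersenne primes chosen only for convenience (far too small to be secure), there is no certificate or padding, and the `symmetric` helper is a made-up XOR keystream cipher standing in for a real symmetric algorithm.

```python
import hashlib
import secrets

# Server configuration: a toy RSA key pair. The two primes are Mersenne primes
# picked for convenience; a modulus this small is NOT secure.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537                                   # public key A is (N, E)
D = pow(E, -1, (P - 1) * (Q - 1))           # private key B is (N, D)


def symmetric(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR with a SHA-256 counter keystream (self-inverse)."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


# Client: generate the random value (key C) and encrypt it with public key A.
key_c = secrets.token_bytes(16)
encrypted_key = pow(int.from_bytes(key_c, "big"), E, N)

# Server: recover key C with private key B.
server_key_c = pow(encrypted_key, D, N).to_bytes(16, "big")

# Server encrypts the response with key C; the client decrypts with the same key.
response = symmetric(server_key_c, b"welcome back")
decrypted = symmetric(key_c, response)
```

The asymmetric step is used only once, to move key C safely; everything afterwards uses the cheaper symmetric cipher.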

Man-in-the-middle attack (MITM)

The process above seems impeccable, right? It is not, because people are the most vulnerable part of a security system.

The security of HTTPS information is completely based on the trustworthiness of the certificate. What if the middleman forges the certificate?

An attacker's forged certificate has to be accepted by the client before access can continue. Once the client accepts it, public key A, private key B, and private key C are all transparent to the attacker, and the data is no longer safe. So the attacker only needs to induce users to install the forged certificate, for example through various phishing sites.
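On the client side, the certificate check is what stands between the user and a forged certificate. As a small illustration using Python's standard `ssl` module (the variable names are assumptions for this sketch), the default client context verifies the certificate chain and the hostname, and both checks can be switched off explicitly, which is the programmatic equivalent of a user clicking through the warning.

```python
import ssl

# The default client context loads the system's trusted CA certificates and
# turns on both certificate-chain verification and hostname checking.
context = ssl.create_default_context()

verifies_chain = context.verify_mode == ssl.CERT_REQUIRED
checks_hostname = context.check_hostname

# Shown for contrast only: with these checks disabled, a forged certificate
# would go unnoticed. check_hostname has to be turned off before verify_mode.
insecure = ssl.create_default_context()
insecure.check_hostname = False
insecure.verify_mode = ssl.CERT_NONE
```

Installing an attacker's root certificate has a similar effect to the insecure context: the forged certificate then passes the checks.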

So even if you use HTTPS to transmit plaintext passwords, it’s not absolutely secure. So how do you keep your passwords safe?

How Baidu handles login

Inspecting Baidu's login request shows that the password is encrypted:

How is it encrypted? We found one key request:

This means that the password is encrypted and decrypted using RSA. So what’s the process like?

Searching GitHub for the keyword RSA, the JavaScript library with the most stars is JSEncrypt. Surprisingly, Baidu's login encryption method and the function names it uses are consistent with this library, so we can reasonably assume that Baidu's whole login request process is basically the same as this open-source library's. So what is JSEncrypt's process?

Encrypted storage

At this point, the encrypted transmission is complete and the server has received the user's real password (decrypted). How should the password be stored?

Never store passwords in plain text

If passwords are stored in plain text (whether in a database or in a log), the risk mentioned at the beginning becomes real: once the data leaks, every user's password is exposed to attackers, and the effort spent encrypting passwords in transit becomes meaningless.

Hash the password

One-way hash algorithm: a hash value can be computed from the plaintext, but the plaintext cannot be recovered from the hash value.

The one-way hash algorithms commonly used for passwords are MD5 and the SHA family (for example SHA-1, SHA-256, SHA-384, and SHA-512).

Although hashing can improve the security of password storage, it is not secure enough.

Usually, after a hacker breaks into a database of passwords, he guesses a password at random, generating a hash. If the hash value exists in the database, he guessed a user’s password correctly. If he doesn’t guess the correct password, it doesn’t matter, he can guess the next password at random and try again.

In fact, to crack passwords more efficiently, attackers compute the hash values of a large number of passwords in advance and store each password with its hash in a table (often called a rainbow table). Cracking a password is then a matter of looking up its hash in the table. As a result, cracking passwords protected only by a hash algorithm is now virtually effortless.
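The lookup attack can be sketched in a few lines of Python. This is a simplification: it uses a plain precomputed dictionary, whereas real rainbow tables use hash chains to save space, and the word list, user names, and passwords are made-up sample data.

```python
import hashlib


def sha256_hex(password: str) -> str:
    """Unsalted SHA-256 of a password, as a hex string."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


# Attacker's precomputed table: hash -> password, built once from a word list.
wordlist = ["123456", "password", "qwerty", "letmein"]   # hypothetical sample
precomputed = {sha256_hex(p): p for p in wordlist}

# Leaked database rows: user name -> unsalted hash (made-up data).
leaked = {"alice": sha256_hex("qwerty"), "bob": sha256_hex("Tr0ub4dor&3")}

# Cracking is a dictionary lookup per row; no hashing is needed at attack time.
cracked = {user: precomputed[h] for user, h in leaked.items() if h in precomputed}
```

Here `cracked` ends up containing only alice's password, because bob's password is not in the word list; the table is built once and reused against every leaked database.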

Add “salt” to improve safety

Salt: a random string. Adding salt to a plaintext password is to concatenate the plaintext password with a random string.

To prevent hackers from using rainbow tables to crack passwords, we can add salt to the plaintext passwords first, and then hash the salted passwords. Since salt is also used in password verification, the salt and password hashes are usually stored together.

When hashing passwords with a salt, make sure every password gets its own random, unique salt, rather than letting all passwords share the same salt.

While salting works well against rainbow tables, it is still not very secure: computing a hash takes so little time that an attacker can still crack passwords by exhaustive search, it just takes a bit longer.
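A minimal sketch of per-password salting in Python follows. The function names, the 16-byte salt length, and the choice of SHA-256 are assumptions for illustration; the salt is simply prepended to the password before hashing, and the salt is returned alongside the digest so both can be stored together.

```python
import hashlib
import hmac
import secrets


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest) for storage, using a fresh random salt per password."""
    salt = secrets.token_bytes(16)
    digest = hashlib.sha256(salt + password.encode("utf-8")).digest()
    return salt, digest


def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
    """Re-hash the candidate with the stored salt and compare in constant time."""
    candidate = hashlib.sha256(salt + password.encode("utf-8")).digest()
    return hmac.compare_digest(candidate, digest)
```

Because every call draws a new salt, two users with the same password end up with different stored digests, so a single precomputed table no longer matches both.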

Increase the difficulty of cracking with BCrypt or PBKDF2

To resist brute force, we need hash algorithms that are deliberately time-consuming rather than highly efficient. This is where BCrypt and PBKDF2 come in.

The biggest feature of these two algorithms is that the number of repeated calculations can be set by a parameter. The more repetitions, the longer each hash takes. If it takes a second or more to calculate one hash, brute-force cracking becomes almost impossible. At one second per attempt, exhausting all 1,000,000 six-digit, digits-only passwords takes about 11.5 days, let alone a high-security password.
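As one possible sketch, PBKDF2 is available in Python's standard library through `hashlib.pbkdf2_hmac`. The iteration count of 200,000, the function names, and the choice of SHA-256 as the underlying hash are assumptions for this example; in practice the count is tuned to the hardware. The last line reproduces the 11.5-day estimate.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 200_000   # assumed work factor; raise it until one hash is noticeably slow


def hash_password(password: str, iterations: int = ITERATIONS) -> tuple[bytes, int, bytes]:
    """Return (salt, iterations, digest); all three are stored with the user record."""
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, iterations, digest


def verify_password(password: str, salt: bytes, iterations: int, digest: bytes) -> bool:
    """Repeat the same derivation with the stored parameters and compare."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, digest)


# The estimate from the text: 10**6 six-digit numeric passwords at one second each.
days_to_exhaust = 10**6 * 1.0 / 86_400   # about 11.57 days
```

Storing the iteration count next to the salt makes it possible to raise the work factor later without invalidating existing records.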

Conclusion

If we want to keep users' information as safe as possible, we need to do the following:

  • Use HTTPS for requests
  • Use RSA to encrypt passwords before transmitting them
  • Use BCrypt or PBKDF2 to hash passwords one-way before storing them

That's all.