
1. A simple HTML example to look at user information security

Standard HTML supports using <input> tags inside a <form> to create fields that are submitted over HTTP. Modern web logins commonly use a form like the following:

<form action="http://localhost:8080/Application/login" method="POST">
    User name: <input id="username" name="username" type="text" />
    Password: <input id="password" name="password" type="password" />
    <button type="submit">Log in</button>
</form>

When the form is submitted, the browser reads the name attribute of each input tag in the form and passes it, together with the entered value, to the back end as a parameter in the body of the HTTP request for login verification.

For example, if my account is user1 and my password is 123456, the following HTTP request is sent to the back end when I submit the login (captured with the Chrome or Firefox developer tools; you need to enable Preserve log):

You can see that even though the password field is displayed as black dots, the captured request still carries the password in clear text.
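As a rough illustration of what ends up on the wire, the sketch below builds the kind of request body a browser would send for this form. It assumes the default application/x-www-form-urlencoded encoding (the form above sets no enctype), and it uses Python only to show the format, not to reproduce the browser.

```python
from urllib.parse import urlencode, parse_qs

# The name attributes of the two inputs become the parameter names.
fields = {"username": "user1", "password": "123456"}

# What the POST body looks like with the default form encoding.
body = urlencode(fields)
print(body)

# Anyone who captures the body can read the password straight back out.
recovered = parse_qs(body)
print(recovered["password"][0])
```

The masking in the password box only affects what is drawn on screen; the submitted value is the plain string.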

2. HTTP protocol transmission directly exposes the user password field

During network transmission, the user’s information is directly exposed if the traffic is sniffed. The following uses Fiddler or Wireshark as an example to show that the captured HTTP packet contains the sensitive information:

3. Can an encryption algorithm ensure password security?

The web front end can encrypt the password field with some algorithm and submit the ciphertext as the content of the HTTP request. The common choices are symmetric and asymmetric encryption.

Symmetric encryption: the same key is used for both encryption and decryption.

Asymmetric encryption: requires two keys, a public key and a private key, which form a pair. Data encrypted with the public key can only be decrypted with the corresponding private key, and data encrypted with the private key can only be decrypted with the corresponding public key.

3.1 Using symmetric encryption

Having the front end and back end agree on an encryption and decryption scheme may seem like a good idea. For example, the front end could use a simple string shift followed by a string reversal (just an illustration; a real scheme would of course not be this simple). Take the original password 123456 and shift it first:


123456-->456123

Then reverse:


456123-->321654

This simple method seems able to obfuscate the original password, and the back end can easily recover it by reversing the operations. But it has two drawbacks:

1. The front end and the back end have to change their encryption and decryption code at the same time;

2. Front-end encryption can only be written in JS, and JS can be read directly, so the encryption method risks being identified and cracked.
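The shift-then-reverse example above can be sketched in a few lines. This is only a toy: the function names and the shift of 3 are assumptions chosen to match the 123456 example, and nothing here is real encryption.

```python
def obfuscate(password: str, shift: int = 3) -> str:
    """Rotate the string left by `shift`, then reverse it."""
    rotated = password[shift:] + password[:shift]   # 123456 -> 456123
    return rotated[::-1]                            # 456123 -> 321654


def recover(obfuscated: str, shift: int = 3) -> str:
    """Undo obfuscate(): reverse back, then rotate right by `shift`."""
    rotated = obfuscated[::-1]                      # 321654 -> 456123
    return rotated[-shift:] + rotated[:-shift]      # 456123 -> 123456


print(obfuscate("123456"))
print(recover(obfuscate("123456")))
```

Anyone who reads the front-end source sees exactly these two steps, which is the second drawback listed above.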

3.2 Asymmetric encryption: is HTTPS necessarily secure?

Asymmetric encryption uses a public key and a private key. The public key can be obtained by anyone, while the private key is stored locally and is used to decrypt data that was encrypted with the public key. This mechanism seems to guarantee encrypted transmission, and HTTPS, which is still in use today, is based on this principle.

But is HTTPS necessarily secure? There are two possible risks with HTTPS:

1) HTTPS can keep information from being read while it is in transit. However, if you think about it carefully, HTTPS is an application-layer protocol that relies on SSL at the lower layer to secure the channel, so the data can still be intercepted at the two ends, on the client and on the server.

2) If the client is tricked into installing a man-in-the-middle’s certificate as a trusted certificate, a man-in-the-middle attack on HTTPS will also leak the plaintext password to others.

4. The conclusion: whether over HTTP or HTTPS, passwords must be transmitted as ciphertext

Since HTTPS alone does not guarantee the safety of the user’s password, we should keep protecting the password above the application layer, that is, control it in our own code instead of depending on a particular protocol. The obvious idea is an irreversible hash such as MD5(string): when the user registers, the server stores the MD5(password) value; at login, the web front end computes MD5(password) and transmits it to the back end, which compares it with the ciphertext in the database. (PS: for a fixed output length, the MD5 function always computes the same value for the same string.) The advantages are obvious:

1) Ensure the security of password information inside the user database;

2) During transmission, an intercepted ciphertext cannot simply be reversed into the user’s original password;

3) It is simple and efficient, easy to implement, and every language provides MD5 support, so development is fast.
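A minimal sketch of the register/login flow just described, assuming an in-memory dict in place of the user database. The names USERS, register, client_digest and login are made up for illustration. MD5 is used only because the text uses it, and the addendum at the end of this article explains why it is a weak choice.

```python
import hashlib

USERS = {}  # hypothetical stand-in for the user table: username -> MD5(password)


def client_digest(password: str) -> str:
    """What the web front end would compute before sending the login."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()


def register(username: str, password: str) -> None:
    """Store only the digest, never the plaintext password."""
    USERS[username] = client_digest(password)


def login(username: str, password_md5: str) -> bool:
    """Compare the transmitted digest with the stored one."""
    return USERS.get(username) == password_md5


register("user1", "123456")
print(client_digest("123456"))                    # the value that travels over the network
print(login("user1", client_digest("123456")))    # True
print(login("user1", client_digest("654321")))    # False
```

Note that the server only ever compares digests, so the digest itself is what grants access.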

5. That’s great! So this saves the money spent on HTTPS, right?

Back to the example at the beginning: the user enters user1 and the password 123456. After MD5 processing, the packet actually sent looks like this, whether the protocol is HTTP or HTTPS:

That’s right. Encrypted login is working.

But while we were celebrating password security, we noticed that the money in our account had suddenly disappeared.

Why is that?

The hackers are laughing: they never needed your plaintext password. If they simply intercept the password ciphertext and send it to the server themselves, can’t they log in just the same?

After all, isn’t the value stored in the database exactly that MD5(password) ciphertext? A forged HTTP request carrying it logs in successfully and can then grab other data or transfer the balance.

What can be done about this? It is not that hard, and there are many solutions. They share one principle: the server generates a random validation field, caches it, and sends it to the client. When the client logs in, it passes this field back to the server for validation.

5.1 Solution 1: Verification Code

Applicable scenario: MVC. The controller packages the data Model into the View. With this kind of connection a Session exists, so information stored in the Session can be accessed.

We can then use an open-source captcha generator, such as Kaptcha in Java, to store a captcha value on the server along with the generated captcha image, encode the image in Base64, and return it to the View. The View decodes the Base64 and loads the image (blog.csdn.net/lgh1117/art…), and the stored value is compared with the user’s input the next time the user logs in.

5.2 Solution 2: Token

Applicable scenario: front end and back end separated. This very popular development mode greatly improves development efficiency, and the responsibilities and division of labor are clear. However, because HTTP is stateless (a request knows nothing about the previous request), the server has to keep the state itself. When the user starts to log in, the server generates a random token (such as a UUID), caches it in Redis as the value under the username as the key, and returns the token to the client. When the client submits the login, the server validates the token and deletes the cached record from Redis.

Because a fresh token has to be fetched from the server each time, a valid token indicates that the HTTP request really came from the front end. And because the token is deleted and reset after every login, a hacker who replays captured account and password data will fail to log in.

In short, even with the ciphertext of the account and password, an attacker cannot log in, because a request that does not carry the token issued by the back end is treated as illegal.
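The one-time token idea can be sketched as follows. This is an illustration under assumptions, not the project’s actual code: a plain dict stands in for Redis, and the names TOKENS, USERS, issue_token and login are invented here.

```python
import hashlib
import hmac
import uuid

USERS = {"user1": hashlib.md5(b"123456").hexdigest()}  # username -> MD5(password)
TOKENS = {}  # hypothetical stand-in for Redis: username -> pending token


def issue_token(username: str) -> str:
    """Generate a random token, cache it under the username, return it to the client."""
    token = uuid.uuid4().hex
    TOKENS[username] = token
    return token


def login(username: str, password_md5: str, token: str) -> bool:
    """Validate the token once, then check the password digest."""
    expected = TOKENS.pop(username, None)  # deleted on use, so it cannot be replayed
    if expected is None or not hmac.compare_digest(expected, token):
        return False
    stored = USERS.get(username)
    return stored is not None and hmac.compare_digest(stored, password_md5)


token = issue_token("user1")
digest = hashlib.md5(b"123456").hexdigest()
first_attempt = login("user1", digest, token)    # legitimate login
replayed_attempt = login("user1", digest, token) # same packet sent again
print(first_attempt, replayed_attempt)
```

The second call fails because the cached token was removed by the first one, which is exactly what defeats a replayed packet.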

6. It’s not easy! But don’t rejoice too soon, for the data may be tampered with

The password’s encrypted so the hacker can’t see the plain text. With the Token added, the log-in process can no longer be intercepted and replayed.

But think about this situation: you are making an online payment on an Alipay-like platform, and the request needs four fields: account, password, amount, and token. You pay 1 yuan for a bag of Little Raccoon crispy noodles with free shipping. After the payment goes through, you find that 10,000 yuan has been deducted from your account balance.

How does that work?

Because even without logging in or operating the account, the hacker can still do damage. When the request is routed through the hacker’s machine, the packet is intercepted. There is no need to log in: the account and password are correct and the token is correct too, so the hacker just changes the amount field in the packet to 10,000 and forwards it to the server. As the victim, I fall into this pit without knowing why.

How do we solve this? In fact, the principle is similar to the digital signature mechanism in HTTPS.

6.1 What is a digital digest?

When downloading files, we often see that the download site also provides a “digital digest” of the file, so that downloaders can verify that the downloaded file is complete, that is, identical to the file on the server.

A digital digest uses a one-way hash function to “digest” the plaintext into a fixed-length (128-bit) string of ciphertext, also called a digital fingerprint. Its length is fixed; different plaintexts produce, in practice, different digests, while identical content always produces the same digest.

Therefore, “digital fingerprint” might be a more fitting name than “digital digest”. The digital digest is the fundamental reason HTTPS can ensure data integrity and resist tampering.
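A small sketch of the download check described above. The byte strings are invented sample data, and MD5 is picked only because its 128-bit output matches the length mentioned in the text.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the MD5 digest of the data as a hex string."""
    return hashlib.md5(data).hexdigest()


published = b"release-1.0 contents"       # what the server offers
downloaded_ok = b"release-1.0 contents"   # an intact download
downloaded_bad = b"release-1.0 c0ntents"  # one character changed on the way

print(len(digest(published)) * 4)                   # digest length in bits
print(digest(downloaded_ok) == digest(published))   # same content, same digest
print(digest(downloaded_bad) == digest(published))  # altered content, different digest
```

The digest only shows whether the content changed; by itself it says nothing about who produced it, which is what the signature in the next section adds.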

6.2 Digital signature: the technology that follows naturally

Suppose the sender wants to send a message to the receiver. Before sending, the sender uses a hash function to generate a message digest from the message text and then encrypts the digest with its private key. The encrypted digest is sent to the receiver along with the message as the message’s “signature”. The receiver first computes the message digest from the received original message, using the same hash function as the sender.

The receiver then uses the sender’s public key to decrypt the digital signature attached to the message. If the two digests are the same, the receiver can confirm that the message came from the sender and has not been truncated or modified. This is what combining “asymmetric key encryption and decryption” with “digital digests” achieves, and it is known as a “digital signature”.

In other words, generating a digest of the transmitted data and encrypting it with a private key is the process of producing a “digital signature”, and the encrypted digest is the “digital signature” itself.
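To make the sign-then-verify steps concrete, here is a toy sketch using textbook RSA with a deliberately tiny key (n = 61 × 53 = 3233, e = 17, d = 2753). The key, the function names and the message are assumptions for illustration only. The digest is reduced modulo n so it fits the toy key, which a real system would never do; a real system would use a vetted crypto library with proper key sizes and padding.

```python
import hashlib

# Toy RSA key pair, far too small for real use.
N = 3233   # modulus, 61 * 53
E = 17     # public exponent
D = 2753   # private exponent, since 17 * 2753 = 1 (mod 3120)


def digest_int(message: bytes) -> int:
    """Hash the message and squeeze the digest into the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Sender: encrypt the digest with the private key."""
    return pow(digest_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Receiver: decrypt the signature with the public key and compare digests."""
    return pow(signature, E, N) == digest_int(message)


message = b"pay 1 yuan to the noodle shop"
signature = sign(message)
print(verify(message, signature))            # genuine signature
print(verify(message, (signature + 1) % N))  # altered signature is rejected
```

With a modulus this small, two different messages can easily share a reduced digest, so the sketch shows the mechanics of the two steps and nothing about real-world strength.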

Therefore, on the web side we can sign the username + MD5(password) + token from the previous case to obtain a field checkCode and send it to the server together with the data. The server computes the signature of the received data in the same way and compares it with the checkCode sent by the user to confirm whether the data was tampered with on the way, thereby maintaining data integrity.
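A rough sketch of the checkCode check follows. To stay within the standard library it swaps in HMAC-SHA256 with a shared key for the private-key signature described above, so it is a keyed digest rather than a true digital signature. SHARED_KEY, make_check_code and verify_check_code are hypothetical names.

```python
import hashlib
import hmac

SHARED_KEY = b"demo-key-for-illustration-only"  # assumption: known to client and server


def make_check_code(username: str, password_md5: str, token: str) -> str:
    """Client side: compute checkCode over username + MD5(password) + token."""
    # A separator keeps ("ab", "c") and ("a", "bc") from producing the same message.
    message = "\n".join([username, password_md5, token]).encode("utf-8")
    return hmac.new(SHARED_KEY, message, hashlib.sha256).hexdigest()


def verify_check_code(username: str, password_md5: str, token: str, check_code: str) -> bool:
    """Server side: recompute over the received fields and compare."""
    expected = make_check_code(username, password_md5, token)
    return hmac.compare_digest(expected, check_code)


password_md5 = hashlib.md5(b"123456").hexdigest()
check_code = make_check_code("user1", password_md5, "token-abc")

untouched = verify_check_code("user1", password_md5, "token-abc", check_code)
tampered = verify_check_code("user2", password_md5, "token-abc", check_code)
print(untouched, tampered)
```

For the payment example, the amount field would be included in the signed message in the same way, so changing 1 to 10,000 would invalidate the checkCode.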

7. Summary

A seemingly simple web login actually hides many security risks. These improvements came out of a real web project; the analysis and evolution above are the solutions proposed during a security review of that project. There are surely many shortcomings, so I hope we can discuss them and improve together!

Addendum 1: the JS encryption function can be cracked

Thanks to mysgk for pointing out the problem of the JS encryption function being cracked in the integrity check:

Problem Description:

If a hacker finds the encryption algorithm by reading the front-end JS source code, does that mean they can construct a checkCode that the server will accept, and so fool the server?

I think the following is a strategy that many websites already use:

The digest or encryption algorithm does not sit in the browser as a static JS file. Instead, the web side requests it from the server, and the server can choose, based on the random token value, a correspondingly random encryption strategy and return it as JS code in the response to an asynchronous request. The client thus loads the digest strategy dynamically, which makes it hard to copy.

Addendum 2: MD5 has hidden problems

Thanks to EtherDream for pointing out that MD5 is outdated and insecure:

Problem Description:

MD5 and SHA256 are out of date… Now PBKDF and bcrypt are out of date too.

1. This article focuses on introducing the approach. It does not have to be the MD5 function; other methods can be used instead.

2. MD5 does have hidden dangers. I had not considered this much before, and I am grateful that it was pointed out, because it is true. The main idea is:

Cracking MD5 is really a matter of collisions. Suppose the original text A produces digest M through MD5. We do not need to restore A from M; we only need to find some text B that produces the same digest M. Writing the hash function as MD5(), if MD5(A) = M and MD5(B) = M, then any such B counts as a cracking result. B may or may not be equal to A.

So if you intercept the MD5 ciphertext, you may be able to find a “pseudo-original” that is not the original password but still logs in successfully after hashing.
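The idea of a “pseudo-original” can be demonstrated on a deliberately weakened hash. The sketch below keeps only the first 4 hex characters (16 bits) of MD5 so that a brute-force search finishes quickly. That truncation is an assumption made purely for the demo; finding such a B for a full 128-bit digest by this kind of search is far more expensive. The names short_md5 and find_pseudo_original are invented.

```python
import hashlib
from itertools import count


def short_md5(text: str, hex_chars: int = 4) -> str:
    """MD5 truncated to a few hex characters, to make collisions easy to find."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()[:hex_chars]


def find_pseudo_original(target: str, original: str, hex_chars: int = 4) -> str:
    """Search for some B != A whose truncated digest equals the target M."""
    for i in count():
        candidate = f"guess{i}"
        if candidate != original and short_md5(candidate, hex_chars) == target:
            return candidate


original = "123456"                 # A, the real password
target = short_md5(original)        # M, the intercepted (truncated) digest
pseudo = find_pseudo_original(target, original)  # B

print(pseudo != original, short_md5(pseudo) == target)
```

A server that compares only these digests would accept B as readily as A, which is the situation the paragraph above describes.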