
This article briefly introduces the HTTPS protocol, how HTTPS keeps data secure through encryption, and how it differs from the HTTP protocol.

Previously we looked at the HTTP protocol. HTTP is exposed to security problems such as eavesdropping and identity spoofing. Communicating over HTTPS effectively prevents these problems, so let's take a quick look at the HTTPS protocol!

1 Disadvantages of HTTP

HTTP has three main shortcomings:

  1. Communication is in plain text (not encrypted) and the content may be eavesdropped;
  2. The identity of the communication party is not verified, so it is possible to encounter camouflage;
  3. The integrity of the message cannot be proved, so it may have been tampered with.

1.1 Communication using plaintext may be eavesdropped

Since HTTP itself has no encryption capability, the communication as a whole (the content of the requests and responses sent over HTTP) cannot be encrypted. That is, HTTP messages are sent in plaintext.
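To make "plaintext" concrete, here is a minimal sketch. The login request, the host name and the `eavesdrop` helper are all made up for illustration; the point is only that whoever captures the bytes can read the body without any key.

```python
# A hypothetical login request, laid out the way HTTP puts it on the wire.
body = "user=alice&password=s3cret"
request = (
    "POST /login HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    f"Content-Length: {len(body)}\r\n"
    "\r\n"
    f"{body}"
).encode("ascii")


def eavesdrop(captured: bytes) -> str:
    """Return what anyone holding the captured bytes can read: the body as-is."""
    return captured.split(b"\r\n\r\n", 1)[1].decode("ascii")


print(eavesdrop(request))  # user=alice&password=s3cret
```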

  1. TCP/IP networks can be eavesdropped on
    1. Because of how the TCP/IP protocol family works, communication content can be snooped on at any point along the communication path.
    2. The Internet is made up of networks that span the whole world. Wherever the server and the client are located, the network equipment, optical cables, and computers on the path between them are unlikely to all be privately owned, so malicious snooping at some point along the way cannot be ruled out.
    3. Even encrypted communications can be captured, just like unencrypted ones. Encryption may make the meaning of a message unreadable, but the encrypted message itself can still be seen.
    4. Eavesdropping on communications on the same network segment is not difficult. It only requires collecting the data packets (frames) flowing over that network. The collected packets can then be analyzed with packet capture or sniffer tools.
  2. Encryption to prevent eavesdropping
    1. One way is to encrypt the communication channel. HTTP has no encryption mechanism of its own, but it can be combined with SSL (Secure Sockets Layer) or TLS (Transport Layer Security) to encrypt HTTP traffic. Once a secure communication line is established with SSL, HTTP traffic can be carried over that line. HTTP used in combination with SSL is called HTTPS (HTTP Secure) or HTTP over SSL.
    2. The other way is to encrypt the content itself. Since HTTP has no encryption mechanism, the content carried in the HTTP message is encrypted instead. In this case, the client must encrypt the message body before sending the request, and both the client and the server must be able to encrypt and decrypt it. However, because the communication line as a whole is not encrypted, the content may still be tampered with.

1.2 If you do not verify the identity of the communicating party, you may encounter disguise

Requests and responses in the HTTP protocol do not verify the communicating party. In other words, HTTP cannot answer questions such as "Is the server really the host specified by the URI in the request?" and "Does the response really go back to the client that made the request?"

  1. Anyone can make a request
    1. In HTTP communication, anyone can initiate a request because there is no step that verifies the communicating party. In addition, whenever the server receives a request, it returns a response no matter who sent it (provided the Web server does not restrict the sender's IP address or port number).
    2. The HTTP protocol itself is very simple: whoever sends a request gets a response. Not verifying the communicating party therefore carries the following risks:
      1. There is no way to determine whether the Web server that the request reached is the one that was supposed to return the response. It could be a Web server in disguise.
      2. There is no way to determine whether the client that the response reached is the one that was supposed to receive it. It could be a client in disguise.
      3. There is no way to determine whether the peer has access rights. Some Web servers hold important information and only want to communicate with specific authorized users.
      4. There is no way to determine where a request came from or who sent it. Even meaningless requests are accepted in full, so Denial of Service (DoS) attacks that flood the server with requests cannot be prevented.

  2. Check the other party's certificate
    1. Although HTTP cannot identify the communicating party, SSL can. SSL not only provides encryption, but also uses certificates, which make it possible to identify the other party.
    2. A certificate is issued by a trusted third-party organization to prove that the server or client really exists. In addition, forging a certificate is technically extremely difficult. Therefore, as long as we can verify the certificate held by the communicating party (server or client), we can confirm that party's identity.

Certificates are used to prove that the communicating party is the intended server. This will reduce the risk of personal information leakage.

In addition, a client that holds a certificate can use it to prove its own identity, which can also be used for Web site authentication.

1.3 Packet integrity cannot be proved and may have been tampered with

Integrity refers to the accuracy of information. If integrity cannot be proved, there is usually no way to judge whether the information is accurate.

The content received may be incorrect

Since the HTTP protocol cannot prove the integrity of a message, there is no way to know whether a request or response was tampered with between the time it was sent and the time it was received. In other words, there is no way to confirm that the request or response that was sent and the one that was received are identical.

For example, when downloading content from a Web site, it is impossible to determine whether the file the client downloaded matches the file stored on the server. The contents of the file may have been changed to something else in transit, and even if they were, the client, as the recipient, would not be aware of it. An attack like this, in which an attacker intercepts a request or response in transit and tampers with its content, is called a man-in-the-middle (MITM) attack.

How to prevent tampering

Although there are ways to check message integrity when using the HTTP protocol, they are neither convenient nor reliable. The commonly used ones are MD5 and SHA-1 hash value verification and digital signatures (for example, PGP signatures) that confirm a file.

Unfortunately, even with these methods there is no 100% guarantee that the result is correct. If the PGP signature or the MD5 hash itself has also been replaced, the user has no way to notice. To prevent this effectively, it is necessary to use HTTPS. SSL provides authentication, encryption, and digest functions. It is very difficult to ensure integrity with HTTP alone.
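The weakness of publishing a hash next to the download can be sketched as follows. SHA-256 is used here instead of MD5 or SHA-1 simply because it is the safer choice in Python's `hashlib`; the file contents and function names are invented for the example.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex digest that a site might publish next to a download."""
    return hashlib.sha256(data).hexdigest()


def matches_published_hash(downloaded: bytes, published_hash: str) -> bool:
    """What the user does: hash the download and compare with the published value."""
    return sha256_hex(downloaded) == published_hash


original = b"installer v1.0"
tampered = b"installer v1.0 + malware"
published = sha256_hex(original)

# Tampering with the file alone is detected.
print(matches_published_hash(original, published))  # True
print(matches_published_hash(tampered, published))  # False

# An attacker who can also replace the published hash goes unnoticed.
forged_hash = sha256_hex(tampered)
print(matches_published_hash(tampered, forged_hash))  # True
```

The last line is the case described above: the check only helps if the hash itself reaches the user over a channel the attacker cannot modify.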

2 HTTPS

2.1 Overview of HTTPS

If HTTP communication uses unencrypted plaintext, then anything entered on a Web page, such as a credit card number, is exposed if the communication line is eavesdropped on.

In addition, with HTTP, neither the server nor the client can verify the communicating party, so there is a real chance that you are not communicating with the party you think you are. You also have to consider the possibility that the received message has been tampered with en route.

To solve these problems, encryption and authentication mechanisms need to be added to HTTP. HTTP with these mechanisms added is called HTTPS (HTTP Secure). In other words, HTTP plus encryption, authentication, and integrity protection is HTTPS.

2.2 HTTPS is HTTP wrapped in an SSL shell

HTTPS is not a new protocol at the application layer. Only the communication interface part of HTTP is replaced by the SSL (Secure Sockets Layer) or TLS (Transport Layer Security) protocol.

Normally, HTTP communicates directly with TCP. When SSL is used, HTTP communicates with SSL first, and SSL in turn communicates with TCP. In short, HTTPS is HTTP wrapped in the shell of the SSL protocol.

With SSL, HTTP has the encryption, certificate, and integrity protection features of HTTPS.
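The layering can be sketched with Python's standard `socket` and `ssl` modules. This is only an illustration of "HTTP talks to TLS, TLS talks to TCP"; the `fetch_head` helper is a made-up name, and calling it needs network access to whatever host you pass in.

```python
import socket
import ssl

# Default client-side context: it verifies the server certificate and host name.
context = ssl.create_default_context()


def fetch_head(host: str, port: int = 443) -> bytes:
    """Send one HEAD request over TLS and return the raw response bytes."""
    # Layer 1: a plain TCP connection.
    with socket.create_connection((host, port), timeout=10) as tcp_sock:
        # Layer 2: TLS wrapped around the TCP socket.
        with context.wrap_socket(tcp_sock, server_hostname=host) as tls_sock:
            # Layer 3: an ordinary HTTP message, written to the TLS socket.
            request = f"HEAD / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
            tls_sock.sendall(request.encode("ascii"))
            chunks = []
            while True:
                data = tls_sock.recv(4096)
                if not data:
                    break
                chunks.append(data)
            return b"".join(chunks)
```

The HTTP text itself is exactly what it would be without SSL; only the socket it is written to has changed.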

SSL is independent of HTTP. Therefore, not only HTTP, but also SMTP and Telnet at the application layer can work with SSL. It can be said that SSL is the most widely used network security technology in the world.

TLS is the successor of SSL. In the protocol's version field, TLS 1.0 identifies itself as 3.1 (in effect "SSL 3.1"), TLS 1.1 as 3.2, and TLS 1.2 as 3.3. In many cases the name SSL is used to refer to both SSL and TLS.

2.3 Public key encryption technology for mutual exchange of keys

SSL uses an encryption process called public-key cryptography, also known as asymmetric cryptography.

In modern encryption methods, the encryption algorithm is public, but the key is secret. This is how the encryption stays secure. Both encryption and decryption use keys. Without the key the ciphertext cannot be decrypted, and conversely, anyone who has the key can decrypt it. If the key is acquired by an attacker, the encryption is meaningless.

2.3.1 Dilemma of Shared Key Encryption

Encryption in which the same key is used for both encryption and decryption is called shared key encryption (common key cryptosystem), also known as symmetric encryption.

With shared key encryption, you must also send the key to the other party. But how exactly do you deliver it safely? If the key is sent over the Internet and the communication is monitored, the key can fall into the hands of an attacker, and the encryption loses its meaning. The receiver must also find a way to keep the received key safe.
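A toy sketch of shared key encryption is shown below. The cipher (XOR with a keystream derived from SHA-256) is a made-up construction that is NOT secure and is only there to show the property that matters: the same key both encrypts and decrypts, so it somehow has to reach the other side.

```python
import hashlib
import secrets


def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Encrypts and decrypts: XOR with the same keystream undoes itself."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


shared_key = secrets.token_bytes(16)  # both parties must hold this same key
plaintext = b"card number 1234"
ciphertext = xor_cipher(shared_key, plaintext)

print(xor_cipher(shared_key, ciphertext))  # b'card number 1234'
```

Anyone who observes `shared_key` on its way to the other party can run the same `xor_cipher` call, which is exactly the dilemma described above.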

2.3.2 Public key encryption

Public key encryption is a good way to solve the problem of shared key encryption.

Public-key encryption uses an asymmetric pair of keys. One is called the private key, and the other is called the public key. As the names imply, the private key must not be known to anyone else, while the public key can be published freely and obtained by anyone.

With public-key encryption, the sender encrypts the message with the receiver's public key. After receiving the encrypted message, the receiver decrypts it with its own private key. In this way, the private key used for decryption never has to be sent, so there is no need to worry that it will be eavesdropped on and stolen by an attacker.

In addition, it is extremely difficult to recover the original message from the ciphertext and the public key, because doing so means solving a hard mathematical problem such as computing a discrete logarithm. If very large integers could be factored quickly, there would be some hope of breaking the encryption, but with current technology that is not realistic.
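The idea can be sketched with textbook RSA. Everything here is a toy: the key is built from two small, well-known Mersenne primes, there is no padding, and the names `P`, `Q`, `E`, `D` are just the usual textbook letters. Real systems use vetted libraries and much larger keys.

```python
# Toy RSA key pair (illustration only, far too small to be secure).
P = 2**61 - 1                       # a known Mersenne prime
Q = 2**89 - 1                       # another known Mersenne prime
N = P * Q                           # modulus, part of both keys
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (needs Python 3.8+)


def encrypt(message: bytes, e: int, n: int) -> int:
    """Anyone can do this with the receiver's public key (e, n)."""
    m = int.from_bytes(message, "big")
    if m >= n:
        raise ValueError("message too long for this toy key")
    return pow(m, e, n)


def decrypt(ciphertext: int, d: int, n: int, length: int) -> bytes:
    """Only the holder of the private exponent d can do this."""
    return pow(ciphertext, d, n).to_bytes(length, "big")


message = b"hello HTTPS"
ciphertext = encrypt(message, E, N)
print(decrypt(ciphertext, D, N, len(message)))  # b'hello HTTPS'
```

Note that `D` never leaves the receiver; only `E` and `N` are handed out.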

2.4 HTTPS uses the mixed encryption mechanism

HTTPS uses a hybrid encryption mechanism that combines shared key encryption and public key encryption. If keys can be exchanged securely, then it may be possible to consider using only public key encryption for communication. However, the processing speed of public key encryption is slower than that of shared key encryption.

Therefore, HTTPS makes full use of the advantages of each and combines the two methods. Public key encryption is used in the key exchange phase, and shared key encryption is used for the messages exchanged after the connection has been established.
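A rough sketch of the hybrid idea, under toy assumptions: the "public key" part is textbook RSA with a small key built from two known Mersenne primes, and the "shared key" part is a made-up XOR keystream cipher. Neither is secure or what TLS really does internally; the sketch only shows the division of labour.

```python
import hashlib
import secrets

# Server's toy RSA key pair (illustration only).
P, Q = 2**61 - 1, 2**89 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))   # stays on the server


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Toy shared-key cipher: XOR with a SHA-256 derived keystream."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


# 1. The client picks a random shared key and sends it under the server's public key.
shared_key = secrets.token_bytes(16)
wrapped_key = pow(int.from_bytes(shared_key, "big"), E, N)

# 2. The server recovers the shared key with its private key.
server_key = pow(wrapped_key, D, N).to_bytes(16, "big")

# 3. Both sides use the fast shared-key cipher for the actual messages.
ciphertext = xor_cipher(shared_key, b"GET /account HTTP/1.1")
print(xor_cipher(server_key, ciphertext))  # b'GET /account HTTP/1.1'
```

The slow public-key operation runs once per connection, while every message after that uses the cheap shared-key cipher.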

2.5 Digital signatures to verify that data is intact

Without the private key, an attacker cannot recover the original data, but the attacker can still get hold of the encrypted data, change part of it, and send it on to the server, so the data the server receives is no longer intact. Because data may be tampered with in this way, we use digital signatures to detect tampering.

A digital signature (also known as a public key digital signature) is a string of numbers that can only be generated by the sender of a message and cannot be forged by others. This string is also a valid proof of the authenticity of the message sent by the sender. It can also be seen as the inverse of asymmetric encryption (the sender uses a private key to encrypt, the receiver uses a public key to decrypt, and the source is assured). Here’s how it works:

  1. The sender A hashes the original message to obtain the hash value h1, which is called the digest.
  2. The digest is then encrypted with A's private key. The result is known as the "signature," and it can only be generated by someone who holds the private key.
  3. The signature and the message are sent to receiver B.
  4. After receiving them, B uses the public key of A to decrypt the digital signature. If decryption succeeds, the message does come from A, and B obtains the sender's hash value h1. If decryption fails, someone is impersonating A.
  5. B then hashes the message body to obtain the hash value h2 and compares h2 with h1. If the two hash values match, the message body has not been changed; if not, it has been tampered with.

The step of decrypting the signature with the public key is called signature verification, and anyone can verify a signature (because the public key is public). Once verification succeeds, the mathematical relationship between the public and private keys shows that the message was sent by the one user who holds the private key, not by just anyone.

Because the private key is unique, a digital signature ensures that the sender cannot later deny having signed the message. The receiver can therefore use the digital signature to convince third parties of the signer's identity and of the fact that the message was sent. When there is a dispute about whether a message was sent and what it contained, a digital signature is powerful evidence.

In addition, by comparing the hash value obtained by decrypting the digital signature with the hash value obtained by the hash calculation of the message, the receiver can also determine that the message has not been tampered with!
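The five steps can be sketched with the same kind of toy RSA key. As before, this is a textbook illustration with a small key built from known Mersenne primes and no padding scheme, so the function names and parameters are assumptions for the example, not a production signature algorithm.

```python
import hashlib

# Sender A's toy RSA key pair (illustration only).
P, Q = 2**61 - 1, 2**89 - 1
N, E = P * Q, 65537                 # (E, N) is A's public key
D = pow(E, -1, (P - 1) * (Q - 1))   # A's private exponent


def digest(message: bytes) -> int:
    """Hash the message and reduce it so it fits under the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Steps 1-2: hash the message, then transform the digest with the private key."""
    return pow(digest(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Steps 4-5: recover h1 with the public key and compare it with h2."""
    h1 = pow(signature, E, N)
    h2 = digest(message)
    return h1 == h2


message = b"transfer 100 to Bob"
signature = sign(message)

print(verify(message, signature))                 # True
print(verify(b"transfer 900 to Eve", signature))  # False
```

Changing either the message or the signature makes h1 and h2 disagree, which is how B notices tampering.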

2.6 Digital certificate that proves the correctness of the public key

The problem with public key encryption is that it cannot be proved that the public key itself is a real public key. For example, when you are preparing to communicate with a server in public-key encryption mode, how can you prove that the public key received is the one originally intended to be issued by the server? Perhaps along the way, the real public key has been replaced by the attacker.

To solve this problem, we can use public key certificates (digital certificates) issued by a Certificate Authority (CA) and its related organizations.

The Certificate Authority stands in the position of a third party that both the client and the server can trust. The process involving a Certificate Authority is as follows:

  1. The server operator submits its public key to the Certificate Authority and applies for a certificate.
  2. After verifying the identity of the applicant, the Certificate Authority digitally signs the submitted public key, places the signed public key in a public key certificate, and issues that certificate.
  3. The server sends the public key certificate issued by the Certificate Authority to the client so that the two can communicate using public key encryption. A public key certificate can also be called a digital certificate or simply a certificate.
  4. The client that receives the certificate uses the public key of the Certificate Authority to verify the digital signature on the certificate. Once verification succeeds, the client knows two things: first, that the authority that certified the server's public key is a genuine, valid Certificate Authority; second, that the server's public key is trustworthy.
  5. For this to work, the public key of the Certificate Authority must itself reach the client securely, and delivering it securely over a network is difficult. For that reason, most browser vendors ship their browsers with the public keys of commonly used Certificate Authorities built in.
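The process above can be sketched as a toy model. Real certificates are X.509 structures with many more fields; here a "certificate" is just a dictionary, the CA key is textbook RSA built from two known Mersenne primes, and all names (`issue_certificate`, `client_verifies`, `example.com`) are invented for the illustration.

```python
import hashlib

# Toy CA key pair. The public half (CA_E, CA_N) plays the role of the key
# that browsers ship with; CA_D is known only to the CA.
CA_P, CA_Q = 2**107 - 1, 2**127 - 1
CA_N, CA_E = CA_P * CA_Q, 65537
CA_D = pow(CA_E, -1, (CA_P - 1) * (CA_Q - 1))


def cert_digest(host: str, public_key: tuple, modulus: int) -> int:
    """Hash the host name together with the server's public key."""
    data = f"{host}|{public_key[0]}|{public_key[1]}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % modulus


def issue_certificate(host: str, public_key: tuple) -> dict:
    """Step 2: the CA signs the applicant's public key with its private key."""
    signature = pow(cert_digest(host, public_key, CA_N), CA_D, CA_N)
    return {"host": host, "public_key": public_key, "signature": signature}


def client_verifies(cert: dict, trusted_ca_key: tuple) -> bool:
    """Step 4: the client checks the signature with the built-in CA public key."""
    e, n = trusted_ca_key
    expected = cert_digest(cert["host"], cert["public_key"], n)
    return pow(cert["signature"], e, n) == expected


server_key = (65537, (2**61 - 1) * (2**89 - 1))    # server's toy public key
cert = issue_certificate("example.com", server_key)
print(client_verifies(cert, (CA_E, CA_N)))         # True

# An attacker swaps in a different public key but cannot re-sign the certificate.
forged = dict(cert, public_key=(65537, (2**31 - 1) * (2**61 - 1)))
print(client_verifies(forged, (CA_E, CA_N)))       # False
```

The client never contacts the CA during verification; the built-in CA public key is all it needs.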

2.7 Drawbacks of HTTPS

The drawback of HTTPS is that using SSL makes processing slower.

SSL is slow in two ways. One is that communication itself is slower. The other is that processing slows down because a lot of CPU and memory resources are consumed.

Compared with HTTP, the network load may make communication 2 to 100 times slower. In addition to establishing the TCP connection and sending HTTP requests and responses, SSL communication must also take place, so the overall amount of traffic to process inevitably increases.

The other point is that SSL has to perform encryption. Both the server and the client need to carry out encryption and decryption. As a result, both sides consume more hardware resources than with HTTP, which increases the load.

There is no fundamental solution to the slowdown; dedicated hardware such as SSL accelerators is used to mitigate it. This hardware is built specifically for SSL communication and can perform SSL computations several times faster than software. Handing SSL processing to an SSL accelerator takes that load off the server.

2.8 Differences between HTTP and HTTPS

  1. HTTP URLs start with http://, while HTTPS URLs start with https://.
  2. HTTP is not secure, whereas HTTPS is.
  3. The standard HTTP port is 80, while the standard HTTPS port is 443.
  4. In the OSI network model, HTTP works at the application layer, while HTTPS’s SSL secure transport mechanism works between the application layer and transport layer.
  5. HTTP cannot encrypt, whereas HTTPS encrypts the transmitted data.
  6. HTTP does not require a certificate, whereas HTTPS requires an SSL certificate issued by the CA.
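The first and third differences are easy to see from code. The sketch below uses only the standard library; `describe` and `DEFAULT_PORTS` are names made up for this example.

```python
import http.client
from urllib.parse import urlsplit

# Standard ports as defined in the standard library: 80 and 443.
DEFAULT_PORTS = {"http": http.client.HTTP_PORT, "https": http.client.HTTPS_PORT}


def describe(url: str) -> tuple:
    """Return (scheme, effective port, encrypted?) for an HTTP or HTTPS URL."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return parts.scheme, port, parts.scheme == "https"


print(describe("http://example.com/index.html"))   # ('http', 80, False)
print(describe("https://example.com/index.html"))  # ('https', 443, True)
```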

3 Authenticating the user

HTTP/1.1 uses the following user authentication modes:

  1. BASIC authentication
  2. DIGEST authentication
  3. SSL client authentication
  4. Form-based authentication

3.1 BASIC authentication

BASIC authentication has been defined since HTTP/1.0, and some sites still use it today. It is an authentication method carried out between the Web server and a client that supports it.

Although BASIC authentication uses Base64 encoding, Base64 is not encryption. It can be decoded without any additional information. In other words, the user ID and password are effectively sent in plaintext, so if someone eavesdrops on BASIC authentication over an unencrypted line such as HTTP, the credentials are very likely to be stolen.
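A small sketch shows why Base64 offers no protection. The credentials are the well-known example pair from the HTTP authentication RFCs; the helper names are made up for this post.

```python
import base64


def basic_auth_header(user_id: str, password: str) -> str:
    """Build the request header a client sends for BASIC authentication."""
    token = base64.b64encode(f"{user_id}:{password}".encode("utf-8")).decode("ascii")
    return f"Authorization: Basic {token}"


def decode_basic_token(token: str) -> tuple:
    """What an eavesdropper does: plain Base64 decoding, no key required."""
    user_id, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return user_id, password


print(basic_auth_header("Aladdin", "open sesame"))
# Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
print(decode_basic_token("QWxhZGRpbjpvcGVuIHNlc2FtZQ=="))
# ('Aladdin', 'open sesame')
```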

Another problem is that ordinary browsers provide no way to log out of BASIC authentication, which gets in the way when you want to authenticate again.

BASIC authentication is not used very often, because it is inflexible and inconvenient and does not reach the level of security that most Web sites expect.

3.2 DIGEST authentication

DIGEST authentication also uses a challenge/response scheme, but unlike BASIC authentication it does not send the password directly in plaintext. In a challenge/response scheme, one party first sends an authentication request to the other, then uses the challenge code received from the other party to calculate a response code, and finally returns that response code to the peer for authentication.

The authentication procedure for DIGEST authentication is as follows:
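As a rough sketch, the response code can be computed the way RFC 2617 defines it. The `qop=auth` variant is assumed here, and the input values are the example values from that RFC; the function names are my own.

```python
import hashlib


def md5_hex(text: str) -> str:
    """MD5 as lowercase hex, the form used by DIGEST authentication."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri,
                    nonce, nc, cnonce, qop="auth"):
    """Compute the response code the client returns for a server challenge."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")   # who the user is
    ha2 = md5_hex(f"{method}:{uri}")                  # what is being requested
    # The server's nonce (the challenge) is mixed in, so the password itself
    # never travels over the network.
    return md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")


response = digest_response(
    username="Mufasa",
    realm="testrealm@host.com",
    password="Circle Of Life",
    method="GET",
    uri="/dir/index.html",
    nonce="dcd98b7102dd2f0e8b11d0f600bfb0c093",
    nc="00000001",
    cnonce="0a4f113b",
)
print(response)  # 6629fae49393a05397450978507c4ef1
```

The server performs the same calculation with its stored credentials and compares the two results.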

3.3 SSL Client Authentication

With authentication based on a user ID and password, as long as both are correct, the access is treated as coming from the user. But if the user ID and password are stolen, a third party can easily impersonate the user. SSL client authentication can prevent this.

SSL client authentication is performed using an HTTPS client certificate. With client certificate authentication, the server can confirm whether the access comes from a registered client.

In most cases, SSL client authentication does not rely on certificates alone, but is combined with form-based authentication (described later) to form two-factor authentication. Two-factor authentication means that the authentication process requires not only a password but also some other piece of information held by the applicant, which serves as a second factor.

In other words, the first factor, the SSL client certificate, authenticates the client computer, and the second factor, the password, confirms that the access really is the user's own action.

After two-factor authentication, it can be confirmed that the user in person is accessing the server from the correct computer. Note that using client certificates costs money.
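On the server side, requiring a client certificate is a setting on the TLS context. The sketch below uses Python's `ssl` module; `make_server_context` is a hypothetical helper, and the certificate files a real server would load are left out because they are deployment-specific.

```python
import ssl


def make_server_context(require_client_cert: bool) -> ssl.SSLContext:
    """Build a server-side TLS context, optionally demanding a client certificate."""
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    if require_client_cert:
        # The handshake fails unless the client presents a certificate
        # that chains to a CA this server trusts.
        ctx.verify_mode = ssl.CERT_REQUIRED
    # A real server would also call ctx.load_cert_chain(...) with its own
    # certificate and ctx.load_verify_locations(...) with the CA that issued
    # the client certificates. Those paths are not part of this sketch.
    return ctx


print(make_server_context(True).verify_mode == ssl.CERT_REQUIRED)   # True
print(make_server_context(False).verify_mode == ssl.CERT_NONE)      # True
```

The password check that forms the second factor would then happen at the application level, after the TLS handshake has succeeded.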

3.4 Form-based authentication

Form-based authentication is not defined in the HTTP protocol. The client sends login information (credentials) to the Web application on the server, and authentication is carried out according to the result of verifying that information.

The user interface and the authentication method differ depending on how the Web application is implemented.

In most cases, the user enters login information that was registered in advance, such as a user ID (usually an arbitrary string or an email address) and a password, and sends it to the Web application, which decides whether authentication succeeds based on the verification result.

The BASIC and DIGEST authentication schemes provided by the HTTP standard are rarely used because of problems with convenience and security. SSL client authentication offers a high level of security, but it has not become widespread because of its introduction and maintenance costs. As a result, authentication is mostly form-based.

Form-based authentication generally uses cookies to manage sessions. The Web application on the server checks the user ID and password sent by the client against the information registered beforehand. Because HTTP is a stateless protocol, cookies are used to manage the session and make up for the state management that HTTP lacks.
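A minimal in-memory sketch of this flow follows. The user store, the cookie name `SID`, and the function names are all assumptions for the example; a real application would keep users and sessions in persistent storage and serve everything over HTTPS.

```python
import hashlib
import hmac
import secrets
from http.cookies import SimpleCookie
from typing import Optional


def hash_password(password: str, salt: bytes) -> bytes:
    """Store only a salted hash, never the password itself."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


# Hypothetical pre-registered user store: user ID -> (salt, password hash).
_salt = secrets.token_bytes(16)
USERS = {"alice": (_salt, hash_password("correct horse", _salt))}
SESSIONS = {}   # session ID -> user ID


def login(user_id: str, password: str) -> Optional[str]:
    """Check the submitted form fields; on success return a Set-Cookie header."""
    record = USERS.get(user_id)
    if record is None:
        return None
    salt, expected = record
    if not hmac.compare_digest(hash_password(password, salt), expected):
        return None
    session_id = secrets.token_hex(16)
    SESSIONS[session_id] = user_id
    cookie = SimpleCookie()
    cookie["SID"] = session_id
    cookie["SID"]["httponly"] = True
    cookie["SID"]["secure"] = True   # only send the cookie over HTTPS
    return cookie.output()


def current_user(cookie_header: str) -> Optional[str]:
    """Map the Cookie header of a later request back to the logged-in user."""
    sid = SimpleCookie(cookie_header).get("SID")
    return SESSIONS.get(sid.value) if sid else None
```

After a successful `login`, the browser sends the `SID` cookie back with every request, and `current_user` is how the application recognises the session even though HTTP itself keeps no state.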

Reference:

Illustrated HTTP
