Understanding HTTP and HTTPS in 10 minutes

1. What is a protocol?

A network protocol is a convention, or set of rules, that computers agree on in order to communicate over a network. With such a convention in place, equipment from different manufacturers and computers running different operating systems can communicate with each other.

2. What is HTTP protocol?

HTTP is short for Hypertext Transfer Protocol. It is an application-layer protocol for transferring hypertext, such as HyperText Markup Language (HTML) documents, from a web server to a local browser.

HTTP was originally designed to provide a way to publish and receive HTML pages.

HTTP has several versions (HTTP/1.0, HTTP/1.1, HTTP/2, and HTTP/3). HTTP/1.1, the version described in this article, is still very widely used.

3. How HTTP works

HTTP transfers data on top of the TCP/IP protocol suite. The data transferred can be HTML files, image files, query results, and so on.

HTTP is generally used in the browser/server (B/S) architecture. The browser, acting as the HTTP client, sends requests by URL to the HTTP server, that is, the web server.

Let’s take visiting Baidu as an example:

Baidu visit process
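
The same client/server exchange can be tried locally without contacting a real website. The following is a minimal sketch, not a production client: it starts a throwaway server on 127.0.0.1 with Python's standard http.server module, then plays the browser's role by opening a TCP connection and sending a hand-written GET request. The names HelloHandler, fetch_over_tcp, and demo are made up for this illustration.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with a tiny HTML page."""

    def do_GET(self):
        body = b"<html><body>hello</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_over_tcp(host, port, path="/"):
    """Act as the HTTP client: send a request over a plain TCP socket."""
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}:{port}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(request.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


def demo():
    """Start the local server on a free port, fetch "/", and shut down."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        return fetch_over_tcp("127.0.0.1", server.server_port)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo().decode("iso-8859-1"))
```

Running the script prints the raw response: a status line, the headers, a blank line, and then the HTML body.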

4. HTTP features

HTTP is a request/response protocol that supports client/server mode.
Simple and fast: When a client requests a service from a server, it only needs to send the request method and the path. Common request methods are GET, HEAD, and POST.
Flexible: HTTP allows data objects of any type to be transferred. The type of the data being transferred is indicated by the Content-Type header.
Connectionless: Each connection handles only one request. After the server has processed the request and the client has acknowledged the response, the connection is closed. (HTTP/1.1 keeps connections open by default to avoid this cost.)
Stateless: The protocol keeps no memory of previous transactions. If later processing needs earlier information, that information must be sent again. This makes it hard for the client to maintain a session with the server, so two techniques are used to record HTTP state: cookies and sessions.
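
To make the cookie and session idea concrete, here is a minimal sketch of how a server might recognize a returning client. It assumes an in-memory dictionary as the session store and a cookie named session_id; both, along with the function name handle_request, are hypothetical choices for this example rather than anything HTTP prescribes.

```python
import secrets
from http.cookies import SimpleCookie

# Hypothetical in-memory session store: session id -> per-client data.
SESSIONS = {}


def handle_request(cookie_header):
    """Return (reply_text, set_cookie_value_or_None) for one request.

    cookie_header is the value of the request's Cookie header, or None
    if the client sent no cookie.
    """
    cookie = SimpleCookie(cookie_header or "")
    morsel = cookie.get("session_id")
    if morsel is not None and morsel.value in SESSIONS:
        # Known client: the cookie lets the server find its state again.
        session = SESSIONS[morsel.value]
        session["visits"] += 1
        return f"visit {session['visits']}", None

    # New client: create a session and ask the client to store its id.
    session_id = secrets.token_hex(16)
    SESSIONS[session_id] = {"visits": 1}
    out = SimpleCookie()
    out["session_id"] = session_id
    out["session_id"]["httponly"] = True
    return "visit 1", out["session_id"].OutputString()


if __name__ == "__main__":
    reply, set_cookie = handle_request(None)
    print(reply, "| Set-Cookie:", set_cookie)
    # The client sends back only the name=value part on later requests.
    print(handle_request(set_cookie.split(";")[0])[0])
```

The protocol itself stays stateless. The state lives on the server, and the cookie is only the key the client presents on each request.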

5. Differences between URIs and URLs

HTTP uses Uniform Resource Identifiers (URIs) to transfer data and establish connections.

URI: Uniform Resource Identifier
URL: Uniform Resource Locator

A URI identifies a specific resource: from the URI we know which resource is meant.

A URL locates a specific resource: it indicates where the resource is and how to reach it. Every URL is also a URI, and every file published on the web has its own URL.
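
As a small illustration, Python's standard urllib.parse module can split a URL into the parts that say where a resource lives. The URL and the URN below are made-up examples. The URN is included to show a URI that names a resource without saying where to fetch it.

```python
from urllib.parse import urlsplit

# A URL: identifies the resource AND says how and where to reach it.
url = urlsplit("https://www.example.com:8443/docs/index.html?lang=en#intro")
print(url.scheme)    # protocol used to reach the resource
print(url.hostname)  # server that holds it
print(url.port)      # port on that server
print(url.path)      # location of the resource on the server
print(url.query)     # extra parameters
print(url.fragment)  # position inside the resource

# A URI that is not a URL: it names a book but gives no network location.
urn = urlsplit("urn:isbn:0451450523")
print(urn.scheme, repr(urn.netloc), urn.path)
```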

6. Composition of HTTP packets

Composition of a request message

Request line: includes the request method, URL, and protocol/version
Request headers
Request body
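
The three parts can be assembled by hand to see how they fit together. This is a minimal sketch for illustration. The helper name build_request, the path /login, and the form data are assumptions made for the example.

```python
def build_request(method, target, headers, body=b"", version="HTTP/1.1"):
    """Assemble a raw HTTP request: request line, headers, blank line, body."""
    lines = [f"{method} {target} {version}"]  # request line
    headers = dict(headers)
    if body:
        # Tell the server how many body bytes follow the blank line.
        headers.setdefault("Content-Length", str(len(body)))
    lines += [f"{name}: {value}" for name, value in headers.items()]
    head = "\r\n".join(lines) + "\r\n\r\n"    # blank line ends the headers
    return head.encode("ascii") + body


if __name__ == "__main__":
    raw = build_request("POST", "/login", {"Host": "example.com"}, b"user=a")
    print(raw.decode("ascii"))
```

Each line ends with CRLF, and an empty line separates the headers from the body.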

Composition of a response message

Status line: includes the protocol/version, status code, and reason phrase
Response headers
Response body
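
Going the other way, a raw response can be split back into these parts. The sketch below is deliberately simplified: it assumes the whole response is already in memory and ignores details such as chunked transfer encoding. The function name parse_response is made up for this example.

```python
def parse_response(raw):
    """Split raw response bytes into (version, status, reason, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")  # blank line ends the headers
    lines = head.decode("iso-8859-1").split("\r\n")

    # Status line: protocol/version, status code, optional reason phrase.
    parts = lines[0].split(" ", 2)
    version, status = parts[0], int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""

    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # names are case-insensitive
    return version, status, reason, headers, body


if __name__ == "__main__":
    sample = (
        b"HTTP/1.1 200 OK\r\n"
        b"Content-Type: text/html\r\n"
        b"Content-Length: 5\r\n"
        b"\r\n"
        b"hello"
    )
    print(parse_response(sample))
```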

7. Common request methods

GET: Requests the specified resource; the response returns it in the entity body.
POST: Submits data to the specified resource for processing (for example, submitting a form or uploading a file). The data is carried in the request body. A POST request may create new resources and/or modify existing ones.
HEAD: Like GET, except that the response has no body; it is used to retrieve only the headers.
PUT: Replaces the content of the specified resource with the data sent from the client.
DELETE: Asks the server to delete the specified resource.

A GET request

A POST request

The difference between POST and GET

Both have a request line and request headers; POST additionally has a request body.
GET is mostly used for queries: the request parameters are placed in the URL, and the request should not change content on the server. POST is used to submit data, for example by putting an account name and password in the body.
GET parameters are appended to the URL, so users can see them directly in the address bar. POST data is carried inside the message body, so users do not see it directly, although over plain HTTP it is still sent unencrypted.
The amount of data GET can submit is limited because URL length is limited in practice by browsers and servers. The HTTP protocol itself places no limit on the size of a POST body.
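
The difference in where the parameters travel can be seen with Python's standard urllib, without sending anything over the network. This is a sketch under assumed names: the host example.com, the paths, and the helpers make_get and make_post are all hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request


def make_get(base_url, params):
    """GET: parameters are encoded into the URL's query string."""
    return Request(f"{base_url}?{urlencode(params)}")


def make_post(url, form):
    """POST: parameters are encoded into the request body instead."""
    data = urlencode(form).encode("ascii")
    return Request(
        url,
        data=data,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


if __name__ == "__main__":
    get_req = make_get("http://example.com/search", {"q": "http vs https", "page": 1})
    print(get_req.get_method(), get_req.full_url, get_req.data)

    post_req = make_post("http://example.com/login", {"user": "alice", "password": "secret"})
    print(post_req.get_method(), post_req.full_url, post_req.data)
```

In the GET case the parameters are visible in the URL and there is no body. In the POST case the URL stays clean and the same kind of key=value string sits in the body.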

8. Response status code

When you visit a web page, the browser sends a request to the web server. The server hosting the page responds with a message whose status line contains an HTTP status code.

Status code classification:

1XX – Informational. The server has received the request and the requester should continue the operation.
2XX – Successful. The request was received, understood, and processed successfully.
3XX – Redirection. Further action is required to complete the request.
4XX – Client error. The request contains a syntax error or cannot be fulfilled.
5XX – Server error. An error occurred while the server was processing the request.

Common status codes:

200 OK – The client request succeeded
301 Moved Permanently – The resource (web page, etc.) has been permanently moved to another URL
302 Found – Temporary redirect
400 Bad Request – The client request has a syntax error and cannot be understood by the server
401 Unauthorized – The request is not authorized. This status code must be used with the WWW-Authenticate header field
404 Not Found – The requested resource does not exist, possibly because an incorrect URL was entered
500 Internal Server Error – An unexpected error occurred inside the server
503 Service Unavailable – The server is currently unable to process requests from the client. It may recover after a while.
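
Because the first digit decides the class, client code can often branch on code // 100 instead of listing every code. A minimal sketch follows. The helper names status_class and describe are made up for this example, and the reason phrases come from Python's standard http.HTTPStatus enum.

```python
from http import HTTPStatus

CLASSES = {
    1: "Informational",
    2: "Successful",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def status_class(code):
    """Map a status code to its class using the first digit."""
    return CLASSES.get(code // 100, "Unknown")


def describe(code):
    """Return 'code reason-phrase (class)' for a status code."""
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:  # code is not in the standard library's table
        phrase = "Unknown"
    return f"{code} {phrase} ({status_class(code)})"


if __name__ == "__main__":
    for code in (200, 301, 302, 400, 401, 404, 500, 503):
        print(describe(code))
```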

9. Why use HTTPS?

In practice, most websites now use HTTPS, and this is the direction in which the Internet is heading. The following is the login request for a blog site, captured with Wireshark.

Blog login packet capture

As the capture shows, the account name and password are transmitted in plain text, so the request sent by the client is easy for attackers to intercept and exploit. HTTP is therefore unsuitable for transmitting sensitive information such as accounts and passwords; sending private information over HTTP is very insecure.

Generally, HTTP has the following problems:

Requests are transmitted in plain text, so they are easy to intercept by eavesdropping.
Data integrity is not verified, so messages are easy to tamper with.
The identity of the other party is not verified, so there is a risk of impersonation.

10. What is HTTPS?

To solve the above HTTP problems, HTTPS is used.

Hypertext Transfer Protocol over Secure Socket Layer (HTTPS): Generally understood as HTTP + SSL/TLS. An SSL certificate is used to authenticate the server, and the communication between the browser and the server is encrypted.

So what is SSL?

SSL (Secure Sockets Layer): Developed by Netscape in 1994, SSL sits between TCP/IP and the application-layer protocols and provides security for data communication.

TLS (Transport Layer Security): The successor to SSL. The SSL versions (SSL 1.0, SSL 2.0, SSL 3.0) were developed by Netscape. In 1999 the IETF standardized the protocol and renamed it TLS; TLS 1.0 carries the internal version number 3.1. Four versions have been published: TLS 1.0, TLS 1.1, TLS 1.2, and TLS 1.3, which was standardized in 2018 with major changes. SSL 3.0, TLS 1.0, and TLS 1.1 are deprecated because of security weaknesses. Currently, TLS 1.2 and TLS 1.3 are the most widely used.
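
In application code, the protocol version is usually a matter of configuration rather than implementation. As a hedged sketch, a Python client can refuse the older protocol versions through the standard ssl module. The function name make_client_context is made up for this example, and the choice of TLS 1.2 as the floor is an assumption that a real deployment should review.

```python
import ssl


def make_client_context():
    """Build a client-side TLS context that refuses old protocol versions."""
    # The default context verifies the server's certificate chain and
    # checks that the certificate matches the host name.
    context = ssl.create_default_context()
    # Assumed policy for this sketch: nothing older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


if __name__ == "__main__":
    ctx = make_client_context()
    print(ctx.minimum_version, ctx.verify_mode, ctx.check_hostname)
```

A context like this can then be passed to the standard library's networking calls that accept one, such as http.client.HTTPSConnection.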

History of SSL (Encrypted Internet Communications)

In 1994, Netscape designed version 1.0 of the SSL (Secure Sockets Layer) protocol, but it was never released.
In 1995, Netscape released SSL 2.0, in which serious vulnerabilities were soon discovered.
In 1996, SSL 3.0 was released and became widely used.
In 1999, TLS 1.0, the successor to SSL, was released.
In 2006 and 2008, TLS 1.1 and TLS 1.2 were released.
In 2018, TLS 1.3 was released.

11. How does the browser transfer data using HTTPS?

HTTPS data transfer process

First, the client accesses the server through the URL and asks to establish an SSL connection.
After receiving the client's request, the server sends the website's certificate (which includes the public key) to the client.
The client and the server negotiate the security level of the SSL connection, that is, the level of encryption.
The client's browser generates a session key according to the agreed security level, encrypts it with the website's public key, and sends it to the website.
The server decrypts the session key with its own private key.
The server and the client then encrypt their communication with the session key.
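
The key-exchange steps above can be imitated with toy numbers to show why only the server can recover the session key. This is strictly an illustration and not how a real TLS library works: the RSA primes are tiny, the symmetric cipher is a made-up XOR keystream, and the certificate check is skipped entirely. Note also that TLS 1.3 no longer sends an RSA-encrypted session key and instead derives the key with an ephemeral Diffie-Hellman exchange. All names below are hypothetical.

```python
import hashlib
import secrets

# Toy RSA key pair for the "server". Real keys are 2048 bits or more;
# these tiny primes give no security and are for illustration only.
P, Q = 61, 53
N = P * Q                            # public modulus, sent in the certificate
E = 17                               # public exponent, sent in the certificate
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (needs Python 3.8+)


def client_make_session_key():
    """Client: pick a random session key, encrypt it with the public key."""
    session_key = secrets.randbelow(N - 2) + 2
    encrypted_key = pow(session_key, E, N)
    return session_key, encrypted_key


def server_recover_session_key(encrypted_key):
    """Server: decrypt the session key with the private key."""
    return pow(encrypted_key, D, N)


def xor_stream(session_key, data):
    """Toy symmetric cipher: XOR data with a SHA-256-derived keystream.

    Applying it twice with the same key returns the original data.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(f"{session_key}:{counter}".encode()).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


if __name__ == "__main__":
    client_key, wire_key = client_make_session_key()
    server_key = server_recover_session_key(wire_key)
    ciphertext = xor_stream(client_key, b"user=alice&password=secret")
    print(xor_stream(server_key, ciphertext))
```

The asymmetric step is used only once, to share the session key. All later traffic uses the cheaper symmetric cipher with that key.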

12. Shortcomings of HTTPS

The additional HTTPS handshakes can lengthen page loading time by nearly 50%.
HTTPS connections are cached less efficiently than HTTP, which increases data overhead and power consumption.
Applying for an SSL certificate can cost money, and the more powerful the certificate, the higher the fee.
The cryptographic algorithms used by SSL consume CPU and other server resources.

13. Summary of the differences between HTTPS and HTTP

HTTPS is the secure version of HTTP. Data sent over HTTP is in plain text and is not secure; HTTPS encrypts it with the SSL/TLS protocol.
HTTP and HTTPS connect in different ways and use different default ports: 80 for HTTP and 443 for HTTPS.
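
As a small illustration of the default ports, the sketch below works out which port a client would connect to for a given URL. The helper name effective_port and the example URLs are made up. The constants HTTP_PORT and HTTPS_PORT come from Python's standard http.client module.

```python
import http.client
from urllib.parse import urlsplit

DEFAULT_PORTS = {
    "http": http.client.HTTP_PORT,    # 80
    "https": http.client.HTTPS_PORT,  # 443
}


def effective_port(url):
    """Return the explicit port in the URL, or the scheme's default port."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS[parts.scheme]


if __name__ == "__main__":
    for example in (
        "http://example.com/",
        "https://example.com/",
        "https://example.com:8443/",
    ):
        print(example, "->", effective_port(example))
```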