Preface

HTTP is everywhere on the Web. Whenever you load a web page, a picture, or a video, HTTP is at work.

This article walks you through the basics of HTTP with short examples and the necessary explanations.

Contents:

  1. What is HTTP?
  2. A brief history of HTTP;
  3. HTTP and HTTPS.

Part 1. What is HTTP?

The Internet is about communication between Web clients and Web servers.

HTTP stands for HyperText Transfer Protocol. It is essentially an agreement on how the two parties will communicate.

It’s like sending “1?” to a group of friends I play games with. They immediately know I am asking whether they want to play tonight.

But if I send “1?” to someone else, there could be a problem: they don’t know what I’m talking about.

In essence, this is what the HTTP protocol is. We have agreed that if a message is sent in a certain way, the server will understand its intent and respond.

Part 2. A brief history of HTTP

In March 1989, the Internet still belonged to a small minority. HTTP was born at this dawn of the Internet.

HTTP/0.9 – The one-line protocol

In 1989, Tim Berners-Lee, then working at CERN, the European Organization for Nuclear Research, proposed an idea that would allow far-flung researchers to share knowledge.

Originally known as Mesh, it was renamed the World Wide Web during its implementation in 1990. It was built on the existing TCP/IP protocols and consisted of four parts:

  • A text format for representing hypertext documents, the HyperText Markup Language (HTML);
  • A simple protocol for exchanging these documents, the HyperText Transfer Protocol (HTTP);
  • A client to display these documents, the first Web browser, called WorldWideWeb;
  • A server to give access to the documents.

These four parts were completed by the end of 1990. Although web pages could only contain simple text and browsers could only display plain text at that time, this already met the original goal of the Web: sharing information resources.

Here is the HTTP/0.9 request:

GET /page.html

This retrieves the specified document from the target server using GET, the only available method. (The protocol, server, and port number are not needed in the request once the connection to the server is open.)

The response is also extremely simple: just the document itself.

<HTML>Content of web page</HTML>

This means that HTTP/0.9 can only transfer HTML files. When a problem occurs, a special HTML file describing the problem is sent back for people to read.
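To get a feel for how little there is to an HTTP/0.9 exchange, here is a minimal Python sketch that simulates one over a local socket pair. The names `serve_once` and `fetch` and the `documents` dictionary are illustrative assumptions, not part of any real server.

```python
import socket
import threading


def serve_once(conn, documents):
    """Answer one HTTP/0.9-style request: read 'GET <path>', send the document, close."""
    with conn, conn.makefile("rb") as reader:
        line = reader.readline().decode("ascii").strip()
        method, _, path = line.partition(" ")
        if method == "GET" and path in documents:
            body = documents[path]
        else:
            # HTTP/0.9 has no status codes: problems are reported as HTML too.
            body = "<HTML>Problem: document not available</HTML>"
        conn.sendall(body.encode("ascii"))
    # Leaving the with-block closes the connection, which marks the end of the response.


def fetch(path, documents):
    """Send a one-line request and read until the server closes the connection."""
    client, server = socket.socketpair()
    worker = threading.Thread(target=serve_once, args=(server, documents))
    worker.start()
    chunks = []
    with client:
        client.sendall(f"GET {path}\r\n".encode("ascii"))
        while True:
            data = client.recv(4096)
            if not data:
                break
            chunks.append(data)
    worker.join()
    return b"".join(chunks).decode("ascii")


pages = {"/page.html": "<HTML>Content of web page</HTML>"}
print(fetch("/page.html", pages))
```

Note that the only way the client knows the response is complete is that the connection closes.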

HTTP/1.0 – Building extensibility

Because HTTP/0.9 was so limited, and because the use of HTTP and HTML was growing rapidly, browsers and servers quickly extended the protocol to make it more versatile:

  • Protocol version information is sent with each request;

    ---------- HTTP/0.9 request ----------
    GET /page.html
    ---------- HTTP/1.0 request ----------
    GET /page.html HTTP/1.0    --> new: protocol version

  • The server responds with a status code so that the browser can understand the success or failure of the request and adjust its behavior accordingly (such as updating or using its local cache);

    ---------- HTTP/0.9 response ----------
    <HTML>....</HTML>
    ---------- HTTP/1.0 response ----------
    200 OK    --> new: status code
    <HTML>...</HTML>

  • The concept of HTTP headers was introduced for both requests and responses, allowing other information to be transmitted and making the protocol more flexible and extensible;

  • With the help of HTTP headers, it became possible to transfer other types of documents in addition to plain HTML files (thanks to the Content-Type header).
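The three additions above (version, status code, headers) can be seen by taking a response apart. The following Python sketch parses a hand-written HTTP/1.0 response; the function name `parse_response` and the sample bytes are assumptions made for illustration.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP/1.0 response into status line parts, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # Status line: protocol version, numeric status code, reason phrase.
    version, status, reason = lines[0].split(" ", 2)

    # Header lines: "Name: value". Header names are case-insensitive.
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    return version, int(status), reason, headers, body


sample = (
    b"HTTP/1.0 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"\r\n"
    b"<HTML>Content of web page</HTML>"
)
version, status, reason, headers, body = parse_response(sample)
print(version, status, reason, headers["content-type"])
```

The `Content-Type` header is what tells the browser how to interpret the body, which is why HTTP/1.0 is no longer limited to HTML.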

The HTTP/0.9 specification was about one page long, while the HTTP/1.0 specification, defined in RFC 1945, runs to a full 60 pages. This shows how HTTP had grown into an important tool.

Although HTTP/1.0 was a big leap from HTTP/0.9, it still had many known flaws that needed to be addressed. For example, it interacted poorly with TCP, and caching was not fully considered.

Take the poor TCP interaction as an example. Because HTTP runs on top of TCP, a connection must be established before communication and closed afterwards.

HTTP/1.0 opens a new connection for every request and closes it afterwards, which adds unnecessary communication overhead.

HTTP/1.1 – Standardized protocol

RFC 1945 describes HTTP/1.0, but it is an informational document rather than an official standard, so implementations in practice were quite inconsistent. Work on the first standardized version of HTTP therefore began in 1995, the year before the HTTP/1.0 document was published.

HTTP/1.1 was released as RFC 2068 in January 1997. It removed a lot of ambiguity and introduced several improvements:

  • Connections can be reused, which saves the time of opening a TCP connection again and again to load the resources of a web page;

  • Pipelining was added, which reduces communication latency by allowing a second request to be sent before the response to the first one has been fully transmitted;

  • Chunked responses are supported;

  • Additional cache control mechanisms were introduced, with a number of options available in the Cache-Control header;

  • Content negotiation was introduced, covering language, encoding, type, and so on, so that the client and server can agree on the most appropriate content to exchange;

  • Thanks to the Host header, different domain names can be served from the same IP address.
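Of the improvements above, response chunking is easy to illustrate in isolation. The following Python sketch shows the wire format of the chunked transfer coding: each chunk is its size in hexadecimal, a CRLF, the data, and another CRLF, with a zero-size chunk marking the end. The helper names `encode_chunked` and `decode_chunked` are assumptions, and trailers are not handled.

```python
def encode_chunked(parts):
    """Encode an iterable of byte strings as a chunked message body."""
    out = b""
    for part in parts:
        if part:  # a zero-length chunk would terminate the body early
            out += f"{len(part):X}".encode("ascii") + b"\r\n" + part + b"\r\n"
    return out + b"0\r\n\r\n"  # last chunk, followed by an empty trailer section


def decode_chunked(data):
    """Decode a chunked body back into the original bytes (no trailer support)."""
    body = b""
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        # The size may be followed by ";extension" parameters, which we ignore.
        size = int(data[pos:line_end].split(b";")[0], 16)
        pos = line_end + 2
        if size == 0:
            return body
        body += data[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF


wire = encode_chunked([b"Hello, ", b"world"])
print(wire)
print(decode_chunked(wire))
```

Chunking lets a server start sending a response before it knows the total length, while still keeping the connection open for the next request.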

In a typical request flow, all requests travel over a single connection.
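As a rough sketch of connection reuse, the following Python example starts a throwaway local server and sends two requests over one `http.client.HTTPConnection`. The handler class, the paths, and the helper name `fetch_twice_on_one_connection` are assumptions for illustration; the server records the client's source port to show that both requests arrived on the same TCP connection.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets the server keep the connection open
    seen_ports = []                # client source port of every request received

    def do_GET(self):
        Handler.seen_ports.append(self.client_address[1])
        body = f"you asked for {self.path}".encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        # Content-Length tells the client where this response ends,
        # so the connection can stay open for the next request.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def fetch_twice_on_one_connection():
    Handler.seen_ports.clear()
    server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()

    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
    bodies = []
    for path in ("/page.html", "/style.css"):
        conn.request("GET", path)
        bodies.append(conn.getresponse().read().decode("ascii"))
    conn.close()

    server.shutdown()
    server.server_close()
    return bodies, list(Handler.seen_ports)


bodies, ports = fetch_twice_on_one_connection()
print(bodies, ports)
```

If the two recorded ports are equal, the second request did not pay for a new TCP handshake.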

More than 15 years of extensions

Because of HTTP’s extensibility (it is easy to create new headers and methods), the protocol has been in steady use for more than 15 years. During this period, HTTP/1.1 was revised several times (RFC 2616, then RFC 7230 to RFC 7235), which laid a solid foundation for HTTP/2.

HTTP/2 – For better performance

Over the years, web pages have grown more complex, some becoming applications in their own right. The amount of visual media displayed and the size of the scripts that add interactivity have both increased a lot, so much more data is transferred over many more HTTP requests.

Between 2010 and 2015, Google demonstrated the feasibility of the experimental SPDY protocol, which became the basis for the HTTP/2 protocol.

HTTP/2 differs from HTTP/1.1 in several basic ways:

  • HTTP/2 is a binary protocol rather than a text protocol, so it is no longer human-readable. Both header information and data bodies are binary (and smaller), and are collectively referred to as frames;

  • It is a multiplexed protocol. Parallel requests can be handled over the same connection, removing the ordering and blocking constraints of HTTP/1.x;

* Note: HTTP/2 does not merge requests into a single packet; they are sent as multiple streams. The illustration merges them only for drawing convenience.

You can get a sense of how much faster HTTP/2 is than HTTP/1.1 by clicking here.

  • It compresses headers. Because headers are often similar across a series of requests, this removes the duplication and the cost of transferring the same data repeatedly. The algorithm that does this is called HPACK;

  • It allows the server to populate the client cache in advance with data the client will need, through a mechanism called server push.
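To make the "binary frames" and "multiplexing" points above more concrete, here is a small Python sketch of the 9-byte HTTP/2 frame header (24-bit payload length, 8-bit type, 8-bit flags, one reserved bit plus a 31-bit stream identifier). The helper names `pack_frame`, `unpack_frames`, and `reassemble` are assumptions, and the sketch ignores everything else a real HTTP/2 implementation needs (connection preface, SETTINGS, HPACK, flow control).

```python
import struct

DATA = 0x0        # frame type: DATA
END_STREAM = 0x1  # flag: this is the last frame of the stream


def pack_frame(frame_type, flags, stream_id, payload):
    """Prefix a payload with the 9-byte HTTP/2 frame header."""
    length = len(payload)
    header = struct.pack(
        ">BHBBI",
        length >> 16, length & 0xFFFF,  # 24-bit length, split as 8 + 16 bits
        frame_type,
        flags,
        stream_id & 0x7FFFFFFF,         # top bit is reserved and sent as 0
    )
    return header + payload


def unpack_frames(data):
    """Split a byte string into (type, flags, stream_id, payload) tuples."""
    frames = []
    pos = 0
    while pos < len(data):
        hi, lo, frame_type, flags, stream_id = struct.unpack_from(">BHBBI", data, pos)
        length = (hi << 16) | lo
        pos += 9
        frames.append((frame_type, flags, stream_id & 0x7FFFFFFF, data[pos:pos + length]))
        pos += length
    return frames


def reassemble(data):
    """Group DATA payloads by stream id, as a receiver would."""
    streams = {}
    for frame_type, _flags, stream_id, payload in unpack_frames(data):
        if frame_type == DATA:
            streams[stream_id] = streams.get(stream_id, b"") + payload
    return streams


# Frames from streams 1 and 3 are interleaved on one connection.
wire = (
    pack_frame(DATA, 0, 1, b"he")
    + pack_frame(DATA, 0, 3, b"wo")
    + pack_frame(DATA, END_STREAM, 1, b"llo")
    + pack_frame(DATA, END_STREAM, 3, b"rld")
)
print(reassemble(wire))
```

Because every frame carries its stream identifier, the receiver can sort interleaved frames back into independent responses.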

See reference 4 below for more details on what makes HTTP/2 excellent.

After being formally standardized in May 2015, HTTP/2 was a great success, with 8.7% of sites using it by July 2016. High-traffic sites adopted it most quickly, since it saves them considerable data transfer costs.

This rapid adoption is most likely because HTTP/2 does not require changes to sites and applications: whether HTTP/1.1 or HTTP/2 is used is transparent to them.

An up-to-date server talking to a recent browser is enough. Only a small group of people has to make any change, and as older browsers and servers are updated, adoption increases naturally without Web developers having to do anything.

Evolution after HTTP/2

HTTP did not stop evolving with the release of HTTP/2. As with HTTP/1.x before it, HTTP’s extensibility is still being used to add new functionality.

The evolution of HTTP has proven its extensibility and simplicity, which has freed up creativity and encouraged many applications to adopt the protocol.

HTTP/3 – A better future

HTTP/3 is the upcoming third major version of the HTTP protocol. Unlike its predecessors, HTTP/3 abandons TCP and runs over QUIC, which is built on UDP.

This change is intended to address head-of-line blocking in HTTP/2. Because HTTP/2 multiplexes all streams over a single TCP connection, even a small amount of packet loss can block every stream on that connection while TCP recovers the lost data.
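The difference can be illustrated with a deliberately simplified toy model in Python. It is not how TCP or QUIC are implemented; it only contrasts one ordered byte stream for the whole connection with independent ordering per stream. The function names and the packet tuples are assumptions.

```python
def deliverable_tcp_like(packets, lost):
    """One ordered stream for the whole connection:
    nothing after the first missing packet can be handed to the application."""
    delivered = []
    for seq, stream, data in packets:
        if seq in lost:
            break  # everything behind the gap waits for retransmission
        delivered.append((stream, data))
    return delivered


def deliverable_quic_like(packets, lost):
    """Ordering is tracked per stream:
    a lost packet only stalls the stream it belongs to."""
    blocked = set()
    delivered = []
    for seq, stream, data in packets:
        if seq in lost:
            blocked.add(stream)
        elif stream not in blocked:
            delivered.append((stream, data))
    return delivered


# (sequence number, stream name, payload); packet 2 is lost in transit.
packets = [(1, "A", "a1"), (2, "B", "b1"), (3, "A", "a2"), (4, "B", "b2"), (5, "C", "c1")]
print(deliverable_tcp_like(packets, lost={2}))
print(deliverable_quic_like(packets, lost={2}))
```

In the first model, streams A and C are held up by a loss that only affects stream B; in the second, only stream B waits.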

As of January 2021, HTTP/3 is still in draft status.

Summary

  • HTTP/0.9 can only transfer a single plain HTML document, which is not flexible enough;
  • HTTP/1.x has many shortcomings, such as non-reusable connections (in HTTP/1.0), head-of-line blocking, high protocol overhead, and security concerns;
  • HTTP/2 greatly improves performance through multiplexing, binary framing, header compression, and so on, but problems remain;
  • QUIC is the underlying transport protocol of HTTP/3. It is based on UDP and borrows the best ideas of TCP to be both fast and reliable.

Part 3. HTTP and HTTPS

Why you need HTTPS

The HTTP protocol was not designed with enough security in mind. HTTP-based applications face several risks:

  1. Communication is in plaintext (not encrypted), so the content may be eavesdropped on;
  2. The identity of the other party is not verified, so it may be an impostor;
  3. The integrity of the information cannot be verified, so it may have been tampered with.

HTTPS (HTTP over SSL) adds a Secure Sockets Layer (SSL) underneath HTTP to solve these security problems of network transmission.

How to prevent wiretapping?

Encryption is the obvious solution. But how can we make sure that the encryption method itself is not eavesdropped on while it is being transmitted?

This is where asymmetric encryption comes in. Its revolutionary idea was to split the key into a public key and a private key; it is called asymmetric because the two keys are not the same.

For example, suppose we want to encrypt the number 520. We encrypt it by multiplying it by 91 and publishing the last three digits of the result: 520 × 91 = 47320, so we publish 320.

Note: here 91 acts as the public key, which anyone may know.

We cannot decrypt it by dividing by 91. Instead, we multiply by 11 and take the last three digits of the result: 320 × 11 = 3520, which gives back 520.

Note: here 11 acts as the private key, which only the decryptor knows.

This works because 91 × 11 = 1001, and multiplying any three-digit number by 1001 leaves its last three digits unchanged. This is roughly the principle behind asymmetric encryption. Based on it, both sides of a communication can generate their own public and private keys and communicate with relative security.
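The arithmetic of this toy example can be checked with a few lines of Python. This only reproduces the 91 and 11 illustration from the text; it is not a secure cipher, and the function names are assumptions.

```python
PUBLIC_KEY = 91   # known to everyone
PRIVATE_KEY = 11  # known only to the receiver


def last_three_digits(n):
    return n % 1000


def encrypt(message):
    """Multiply by the public key and keep the last three digits."""
    return last_three_digits(message * PUBLIC_KEY)


def decrypt(ciphertext):
    """Multiply by the private key and keep the last three digits."""
    return last_three_digits(ciphertext * PRIVATE_KEY)


ciphertext = encrypt(520)
print(ciphertext)           # 520 * 91 = 47320 -> 320
print(decrypt(ciphertext))  # 320 * 11 = 3520  -> 520

# The round trip works for every three-digit number because 91 * 11 = 1001,
# and 1001 leaves a remainder of 1 when divided by 1000.
assert all(decrypt(encrypt(m)) == m for m in range(1000))
```

Real algorithms such as RSA rely on the same kind of modular relationship between the two keys, but with numbers large enough that the private key cannot be derived from the public one in practice.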

How to verify the identity of the other party?

The above process may seem watertight, but in TCP/IP end-to-end communication the road is long, and a lot can happen along the way.

Suppose a hacker intercepts the message at the step where the public key is being transmitted and works out what it contains. The hacker can then generate a key pair of their own and impersonate each party to the other, passing along their own public key instead.

The crisis of encryption was followed by a crisis of trust. We need a trusted organization to vouch for identities, and then the problem is solved.

That trusted organization is the Certificate Authority (CA), which issues HTTPS certificates. Whenever a client or server wants to publish its public key, it applies to the CA. Once the application is approved, the CA issues a digital certificate bound to that public key.

During HTTPS communication, the server sends its certificate to the client. The client verifies the certificate and, if verification succeeds, uses the public key in it to start communicating.
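In Python, this verification step is handled by the standard `ssl` module. The sketch below shows a client context that requires a valid certificate chain and a matching host name; `make_client_context` and `peer_certificate` are names assumed for illustration, and `peer_certificate` needs network access to a real HTTPS host, so it is only defined here, not called.

```python
import socket
import ssl


def make_client_context():
    """Create a TLS context that trusts the system's CA certificates."""
    context = ssl.create_default_context()
    # These are already the defaults for a client context; they are spelled
    # out here because they are exactly the checks described in the text.
    context.check_hostname = True            # certificate must match the host name
    context.verify_mode = ssl.CERT_REQUIRED  # certificate must chain to a trusted CA
    return context


def peer_certificate(host, port=443):
    """Connect to a host and return its verified certificate as a dict.

    Raises ssl.SSLCertVerificationError if the certificate cannot be verified.
    """
    context = make_client_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


context = make_client_context()
print(context.verify_mode, context.check_hostname)
```

If verification fails, the handshake is aborted before any application data is exchanged.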

How to prevent tampering?

When we talked about bitcoin earlier, we mentioned hash algorithms. A hash algorithm maps an input of arbitrary length to a binary output of fixed length.

Note: The right side is shown as a hexadecimal number for simplicity.

HTTPS uses a digest algorithm in the same way; it can be loosely understood as a compression of the content. Any change in the content, even a single punctuation mark, produces a different hash value.

Before sending plaintext, the client uses the digest algorithm to compute a fingerprint of the plaintext, encrypts the fingerprint together with the plaintext, and sends the ciphertext to the server.

After decryption, the server applies the same digest algorithm to the plaintext it received and compares the fingerprint it computed with the fingerprint sent by the client. If the two match, the data is intact.
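The fingerprint comparison can be sketched with Python’s `hashlib`. SHA-256 is chosen here only as an example digest algorithm, and the function names are assumptions; the encryption step described above is left out so that the sketch shows the digest check alone.

```python
import hashlib
import hmac


def fingerprint(message: bytes) -> str:
    """Return a fixed-length hexadecimal digest of a message of any length."""
    return hashlib.sha256(message).hexdigest()


def is_intact(message: bytes, claimed_fingerprint: str) -> bool:
    """Recompute the digest on the receiving side and compare it with the claimed one."""
    return hmac.compare_digest(fingerprint(message), claimed_fingerprint)


original = b"Transfer 100 yuan to Alice."
sent_fingerprint = fingerprint(original)

print(len(sent_fingerprint))                                       # always 64 hex characters
print(is_intact(original, sent_fingerprint))                       # unchanged message
print(is_intact(b"Transfer 100 yuan to Alice!", sent_fingerprint)) # one character changed
```

A digest alone only detects changes; it is the encryption around it that stops an attacker from replacing both the message and its fingerprint.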

How is HTTP different from HTTPS?

Although it sounds like HTTPS is simply a more secure HTTP, there are several concrete differences:

  1. HTTP transmits data in plaintext, which is a security risk. HTTPS addresses this weakness by adding SSL/TLS between TCP and HTTP to encrypt the packets.
  2. Establishing an HTTP connection is relatively simple: HTTP packets can be transmitted right after the TCP three-way handshake. With HTTPS, an SSL/TLS handshake must follow the TCP three-way handshake before encrypted packets can be transmitted.
  3. The default HTTP port is 80 and the default HTTPS port is 443.
  4. HTTPS requires the server to obtain a digital certificate from a Certificate Authority (CA) so that its identity can be trusted.
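The port difference in point 3 is visible in Python’s standard library, which defines both defaults as constants. The helper `effective_port` below is an assumed name for illustration; it returns the port a URL would connect to when none is written explicitly.

```python
import http.client
from urllib.parse import urlsplit

# Default ports as defined by the standard library.
DEFAULT_PORTS = {
    "http": http.client.HTTP_PORT,    # 80
    "https": http.client.HTTPS_PORT,  # 443
}


def effective_port(url):
    """Return the explicit port in a URL, or the default port for its scheme."""
    parts = urlsplit(url)
    return parts.port or DEFAULT_PORTS[parts.scheme]


print(effective_port("http://example.com/index.html"))
print(effective_port("https://example.com/index.html"))
print(effective_port("https://example.com:8443/index.html"))
```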

Afterword

The HTTP protocol is the foundation of the complex Web world, and it ensures “seamless communication” between applications. This article has tried to explain things with animated illustrations wherever possible. I hope you enjoyed it.

At this point, we have a fairly good understanding of the HTTP protocol. In the future, I will keep studying the basics of computer networking with you, following the steps of the back-end learning roadmap.

References

  1. How HTTP Works and Why It’s Important – Explained in Plain English – www.freecodecamp.org/news/how-th…
  2. Evolution of HTTP – developer.mozilla.org/en-US/docs/…
  3. The Evolution of HTTP – www.oreilly.com/library/vie…
  4. XxxxHub uses HTTP/2. What’s so cool about it? | Kobayashi Coding – mp.weixin.qq.com/s/TvGAmKKrK…
  5. The features of HTTP/2 and HTTP/3 in one read – blog.fundebug.com/2019/03/07/…
