The history of



1. A long, long time ago, the Web was basically just for browsing documents. Since it was only browsing, I, the server, had no need to keep track of who was browsing which document at a given time. Every request was a fresh HTTP exchange: a request plus a response. I did not have to remember who had just sent a request; each request was new to me. Those were good times.

2. Then interactive Web applications took off: online shopping sites, sites that require a login, and so on. I soon faced a problem: managing sessions. I had to remember who had logged in and who had put items into their shopping cart; in other words, I had to tell everyone apart. This was a big challenge, because HTTP requests are stateless. The idea I came up with was to issue everyone a session ID, basically a random string that is different for each person. Everyone sends it along with each HTTP request, so I can tell who is who.

3. Everyone was happy except the server. Each person only needs to keep their own session ID, but the server has to keep everyone's session IDs! As more people visit, that means tens of thousands, even hundreds of thousands of them.

For example, suppose I use two machines to form a cluster, and F logs in through machine A, so F's session ID is saved on machine A. What if F's next request is forwarded to machine B? Machine B does not have F's session ID.

And if machine A dies, F's requests have to switch to machine B anyway.

So I have to replicate sessions, copying session IDs back and forth between the two machines, which is exhausting.

      

Then there was Memcached: storing session IDs centrally in one place that all machines can access eliminates the need for replication, but it introduces a single point of failure. If the session machine dies, everyone has to log in again, and I would probably be cursed to death.

      

I also tried turning this single machine into a cluster to make it more reliable, but either way, these small sessions remained a heavy burden for me.

4. So some people kept wondering: why should I store these troublesome sessions at all? It would be better for each client to store its own.

But if I do not store the session IDs, how can I verify that a session ID sent by a client was actually generated by me? Without verification, I cannot tell whether a login is legitimate, and malicious users could forge session IDs and do whatever they want.

Well, the key is validation!

For example, once F has logged in, I issue him a token that contains F's user ID. The next time F sends me an HTTP request, he just needs to bring this token along in an HTTP header.

But so far this is no different from a session ID: anyone could forge it. So I have to do something to make forgery impossible.

The answer is to sign the data. For example, I use the HMAC-SHA256 algorithm with a key that only I know to compute a signature over the data, and I use the data together with the signature as the token. Since nobody else knows the key, nobody else can forge the token.

I do not store this token. When F sends it to me, I use the same HMAC-SHA256 algorithm and the same key to compute the signature of the data again and compare it with the signature in the token. If they match, I know that F has logged in, and I can read F's user ID directly from the data. If they differ, the data must have been tampered with, and I tell the sender: sorry, you are not authenticated.

The data in the token is in clear text (I Base64-encode it, but that is not encryption) and can still be read by others, so I must not put sensitive information such as passwords in it.
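To see why Base64 offers no secrecy, here is a small sketch in which the payload is decoded without any key. The payload contents are a made-up example.

```python
import base64
import json

# Hypothetical data part of a token: Base64 of a small JSON document.
payload_part = base64.urlsafe_b64encode(
    json.dumps({"user_id": "F"}).encode("utf-8")
)

# Anyone who sees the token can read the data back; no key is involved.
decoded = json.loads(base64.urlsafe_b64decode(payload_part))
print(decoded)
```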

Of course, if someone's token is stolen, there is nothing I can do: I will treat the thief as a legitimate user. This is really no different from someone's session ID being stolen.

In this way, instead of storing session IDs, I just generate tokens and verify them. I trade CPU time for session storage space!

Relieved of the burden of session IDs, my cluster can now scale horizontally with ease: when user traffic grows, I simply add machines. Being stateless feels great!

Cookie

A cookie is a very concrete thing. It is a piece of data that can be stored persistently in the browser; it is simply a data storage feature implemented by the browser.

A cookie is generated by the server and sent to the browser, which saves it as a key-value pair in a text file under some directory. The next time the browser requests the same website, it sends the cookie back to the server. Because cookies are stored on the client, browsers impose restrictions so that cookies cannot be abused and do not take up too much disk space; for example, the number of cookies per domain is limited.
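The round trip described above (the server sets a cookie, the browser sends it back) can be sketched with Python's standard `http.cookies` module. The cookie name `session_id`, its value, and the attributes chosen here are assumptions for illustration only.

```python
from http.cookies import SimpleCookie

# Server side: build the value of a Set-Cookie response header.
outgoing = SimpleCookie()
outgoing["session_id"] = "abc123"          # hypothetical value
outgoing["session_id"]["path"] = "/"
outgoing["session_id"]["max-age"] = 3600   # ask the browser to keep it for one hour
outgoing["session_id"]["httponly"] = True  # hide it from page scripts
set_cookie_header = outgoing["session_id"].OutputString()
print("Set-Cookie:", set_cookie_header)

# Server side, on the next request: parse the Cookie header the browser sent.
incoming = SimpleCookie()
incoming.load("session_id=abc123; theme=dark")
session_id = incoming["session_id"].value
print(session_id)
```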

Session

Session literally means a conversation. When you are talking to someone, how do you know that you are talking to Zhang San and not to someone else? There must be some characteristic of the other person that tells you he is Zhang San.

Sessions work similarly: the server needs to know who is sending the current request. To do this, the server assigns each client a different “identity”, and the client carries this identity every time it sends a request, so the server knows who the request comes from. There are many ways for the client to store this identity; browser clients use cookies by default.
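A minimal sketch of this idea, assuming an in-memory dictionary as the session table. The function names are hypothetical, and a real server would also expire old entries.

```python
import secrets

# In-memory session table on the server: session ID -> user information.
sessions = {}


def create_session(user_id: str) -> str:
    """Assign a new random identity to a client and remember who it belongs to."""
    session_id = secrets.token_hex(16)  # 32 hex characters, hard to guess
    sessions[session_id] = {"user_id": user_id}
    return session_id


def lookup_session(session_id: str):
    """Return the stored user information, or None for an unknown ID."""
    return sessions.get(session_id)


def destroy_session(session_id: str) -> None:
    """Forget the session, for example when the user logs out."""
    sessions.pop(session_id, None)


sid = create_session("zhangsan")
print(lookup_session(sid))
```

Because `sessions` lives in the memory of one process, a second server would not know about `sid`, which is exactly the clustering problem described earlier.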

With a session, user information is stored temporarily on the server and destroyed when the user leaves the site. Storing user information this way is more secure than storing it in cookies, but sessions have a drawback: if the Web servers are load-balanced, the session is lost when the next request lands on a different server.

Token

Token-based authentication is ubiquitous on the Web. For most Internet companies that expose Web APIs, tokens are the preferred way to handle authentication for multiple users.

The following features are reasons to use token-based authentication in your applications:

1. Stateless and scalable

2. Support for mobile devices

3. Cross-application calls

4. Security

The big players that use token-based authentication

Most of the APIs and Web applications you have seen use tokens: Facebook, Twitter, Google+, GitHub, and so on.

The origin of the Token

Before introducing the principles and advantages of token-based authentication, let’s take a look at what has been done before.

Server-based authentication

We all know that the HTTP protocol is stateless, which means that the program needs to verify every request to identify the client.

Until now, applications have identified requests based on login information stored on the server. This is usually done by storing sessions.

The following figure shows how server-based authentication works

With the rise of Web applications and mobile devices, this approach to authentication is showing more and more problems, especially in terms of scalability.

Some issues exposed by server-based authentication

1. Sessions: Each time a user is authenticated, the server needs to create a record to store the information. As more and more users are authenticated, the memory overhead grows.

2. Scalability: Because sessions keep login information in server memory, they bring scalability problems.

3. CORS (cross-origin resource sharing): When we need to use our data across multiple mobile devices, cross-origin resource sharing becomes a headache. When Ajax is used to fetch resources from another domain, the requests may be blocked.

4. CSRF (cross-site request forgery): A user who is already authenticated with, say, a banking site is vulnerable to cross-site request forgery, and that authentication can be exploited when the user visits other sites.

Of these, scalability is the most prominent problem. We therefore need a more effective approach.

Token-based authentication principle

Token-based authentication is stateless: we do not store user information on the server or in a session.

This concept solves many of the problems of storing information on the server side.

Having no session information means that your application can add and remove machines as needed without worrying about where a user is logged in.

Token-based authentication is performed as follows:

1. The user sends a request with a user name and password.

2. The application verifies the credentials.

3. The application returns a signed token to the client.

4. The client stores the token and sends it with every request.

5. The server verifies the token and returns data.

Every request must carry the token. The token should be sent in an HTTP header so that the HTTP request stays stateless. We also set Access-Control-Allow-Origin: * so that the server accepts requests from all domains. Note that when * is specified in the ACAO header, requests are not allowed to supply credentials such as HTTP authentication, client-side SSL certificates, or cookies.
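As a rough sketch of a client attaching the token to a request, the snippet below builds a request object with Python's standard `urllib.request` and does not send it. The URL, the token value, and the use of the `Authorization: Bearer` header are assumptions; the text only says that the token travels in an HTTP header.

```python
import urllib.request

token = "example-token"  # hypothetical token received at login

# Build (but do not send) a request that carries the token in a header.
request = urllib.request.Request(
    "https://api.example.com/profile",  # placeholder URL
    headers={"Authorization": "Bearer " + token},
)

print(request.get_header("Authorization"))
```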

Implementation idea:

1. After the user's login is verified successfully, the server returns a token to the client.

2. The client saves the received token locally.

3. Each time the client accesses the API, it sends the token to the server.

4. A filter on the server verifies the token. If verification succeeds, the requested data is returned; if it fails, an error code is returned.
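The filter in step 4 might look like the following sketch, written as a Python decorator. Everything here is an assumption made for illustration: the simplified `user_id.signature` token format, the key, the plain dictionary standing in for request headers, and the `(status, body)` return convention.

```python
import functools
import hashlib
import hmac

SECRET_KEY = b"server-side-secret"  # hypothetical key


def _mac(user_id: str) -> str:
    """HMAC-SHA256 of the user ID, as a hex string."""
    return hmac.new(SECRET_KEY, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def sign(user_id: str) -> str:
    """Issue a token in the simplified form 'user_id.signature'."""
    return user_id + "." + _mac(user_id)


def token_required(handler):
    """Filter: call the handler only when the request carries a valid token."""
    @functools.wraps(handler)
    def wrapper(headers: dict):
        token = headers.get("Authorization", "")
        user_id, _, mac = token.rpartition(".")
        valid = bool(user_id) and hmac.compare_digest(
            mac.encode("utf-8"), _mac(user_id).encode("utf-8")
        )
        if not valid:
            return 401, "invalid or missing token"  # verification failed
        return handler(user_id)                     # verification succeeded
    return wrapper


@token_required
def get_profile(user_id: str):
    return 200, "profile of " + user_id


print(get_profile({"Authorization": sign("F")}))
print(get_profile({}))
```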

Once we authenticate the information in the program and get the token, we can do a lot of things with that token.

We can even create a permission-based token and pass it to third-party applications so that they can access our data (only with the specific tokens we allow, of course).

The advantages of tokens

Stateless and scalable

Tokens stored on the client side are stateless and scalable. Because of this statelessness, and because no session information is stored, the load balancer can send a user's requests to any server.

If we stored information about authenticated users in a session, every request from a user would have to go to the same server that authenticated the user (this is called session affinity). With a large number of users, this can cause some congestion.

But don't worry. With tokens, all of these problems are solved, because the token itself holds the user's authentication information.

Security

Sending a token with each request instead of relying on cookies helps prevent CSRF (cross-site request forgery). Even if a cookie is used to store the token on the client side, the cookie is only a storage mechanism and is not used for authentication. And because nothing is stored in a session, there is no session-based information to manipulate.

Tokens are time-limited, so users need to log in again after a certain period. We also do not have to wait for a token to expire on its own: there is token revocation, which can invalidate a particular token or a group of tokens issued under the same authorization.
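Expiry and revocation checks could be sketched as follows. The claim names `exp` and `token_id`, the in-memory `revoked_ids` set, and the function name are all hypothetical, and the sketch assumes the token's signature has already been verified. Note that a revocation list is a small piece of state that the server does have to keep.

```python
import time
from typing import Optional

revoked_ids = set()  # hypothetical server-side list of revoked token IDs


def is_token_usable(claims: dict, now: Optional[float] = None) -> bool:
    """Check expiry and revocation for claims from an already verified token."""
    now = time.time() if now is None else now
    if now >= claims["exp"]:
        return False  # the token has expired; the user must log in again
    if claims["token_id"] in revoked_ids:
        return False  # the token was revoked before it expired
    return True


claims = {"token_id": "t1", "exp": time.time() + 3600}  # valid for one hour
print(is_token_usable(claims))
revoked_ids.add("t1")
print(is_token_usable(claims))
```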

Extensibility

Tokens let us build applications that share permissions with other applications. For example, we can link an arbitrary social account to a major account such as Facebook or Twitter. When we log in to Twitter through a service (say, Buffer), we allow Buffer to post to our Twitter stream.

With tokens, you can grant selective permissions to third-party applications. When users want another application to access their data, we can build our own API and hand out tokens with special permissions.

Multiple platforms and domains

Let's start by talking about CORS (cross-origin resource sharing). As we extend our applications and services, a wide variety of devices and applications will need to access them.

Having our API just serve data, we can also make the design choice to serve assets from a CDN. This eliminates the issues that CORS brings up after we set a quick header configuration for our application.

As long as the user has an authenticated token, data and resources can be requested in any domain.

          Access-Control-Allow-Origin: *       

Standards-based

When creating a token, you can set a few options. We will describe it in more detail in future articles, but the standard usage will be reflected in JSON Web Tokens.

Tools and documentation are available for JSON Web Tokens, and libraries exist for numerous languages. This means that you can switch out your authentication mechanism in the future if you choose to.
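To make the format concrete, here is a rough sketch of how an HS256-signed JSON Web Token is assembled, using only the Python standard library. The claim `sub`, the key, and the function names are assumptions for the example; in practice a maintained JWT library is the safer choice.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64url encoding without padding, as used by JWT."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def encode_jwt(claims: dict, key: bytes) -> str:
    """Build 'header.payload.signature' signed with HMAC-SHA256 (HS256)."""
    header = {"alg": "HS256", "typ": "JWT"}
    segments = [
        b64url(json.dumps(header, separators=(",", ":")).encode("utf-8")),
        b64url(json.dumps(claims, separators=(",", ":")).encode("utf-8")),
    ]
    # The signature covers the encoded header and payload joined by a dot.
    signing_input = ".".join(segments).encode("ascii")
    signature = hmac.new(key, signing_input, hashlib.sha256).digest()
    return ".".join(segments + [b64url(signature)])


jwt_token = encode_jwt({"sub": "F"}, b"hypothetical-key")
print(jwt_token)
```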