Understanding one concept usually requires knowing ten others. Before deciding whether JWT (JSON Web Token) can replace session management, we need to understand what tokens are and how access tokens differ from refresh tokens; what OAuth and SSO are; how OAuth and SAML differ as SSO strategies; and how OAuth differs from OpenID. Most importantly, we need to distinguish authorisation from authentication. Finally, we introduce JSON Web Tokens, discuss JWT’s strengths and weaknesses in session management, and try to address those weaknesses to see what the trade-offs and costs are.

Examples of OAuth authorization and API calls in this article are from the Google API.

About the Token

The word “token” has several meanings even within computing. In this article, a token means a credential used to access resources. For example, when you call the Google API, you need to present a valid token to show that your request is legitimate. This token is issued to you by Google and gives you access to the resources behind the API.

There are many ways to carry a token when requesting an API, either through HTTP headers or URL arguments or through the Google libraries:

// HTTP header:
GET /drive/v2/files HTTP/1.1
Host: www.googleapis.com
Authorization: Bearer <token>

// URL query string parameter:
GET https://www.googleapis.com/drive/v2/files?access_token=<token>

// Python:
from googleapiclient.discovery import build
drive = build('drive', 'v2', credentials=credentials)
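To make the first two options concrete, here is a minimal sketch that builds (but does not send) both kinds of request with Python’s standard library. The token value is a placeholder, and the query parameter name access_token is an assumption based on Google’s OAuth documentation.

```python
from urllib.parse import urlencode
from urllib.request import Request

API_URL = "https://www.googleapis.com/drive/v2/files"


def request_with_header(token: str) -> Request:
    """Attach the token as a bearer credential in the Authorization header."""
    return Request(API_URL, headers={"Authorization": f"Bearer {token}"})


def request_with_query(token: str) -> Request:
    """Attach the token as a URL query parameter instead."""
    return Request(f"{API_URL}?{urlencode({'access_token': token})}")


# Placeholder value; a real token comes from the authorization flow.
header_req = request_with_header("example-token")
query_req = request_with_query("example-token")
print(header_req.get_header("Authorization"))
print(query_req.full_url)
```

The header form is generally preferable, because URLs tend to end up in server logs and browser history.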

More specifically, the tokens used above to invoke the API belong to a subcategory called access tokens. Access tokens generally have an expiry time and must be obtained again once they expire. So how do you get a new one? Let’s rewind for a moment and review how the token was acquired the first time:

  1. First, register your application with the Google API. Once registered, you receive your credentials, including a client ID and a client secret. Not every application type has a secret.
  2. Next, ask Google for an access token. We ignore some details here, such as the request parameters (the secret is included, of course) and how different application types make the request. The important point is that if you want to access a user’s resources, the user is prompted to authorize the access.
  3. If the user grants authorization, Google returns the access token, or it returns an authorization code that you then exchange for an access token.
  4. Once you have the token, you can call the API with it.

The process is as follows:

Note that in the third step, when the authorization code is exchanged for an access token, Google returns not only the access token but also additional information. Of this, the refresh token is what matters for later renewals.
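Steps 2 and 3 can be sketched as two small helpers: one builds the URL the user is sent to, the other builds the form fields for the code exchange. Nothing is sent over the network. The endpoint URLs and the access_type parameter are assumptions taken from Google’s OAuth 2.0 documentation, and the client ID, secret, redirect URI, and code are placeholders.

```python
from urllib.parse import parse_qs, urlencode, urlparse

AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/v2/auth"
TOKEN_ENDPOINT = "https://oauth2.googleapis.com/token"


def build_authorization_url(client_id: str, redirect_uri: str, scope: str, state: str) -> str:
    """Step 2: the URL the user visits in order to grant access."""
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "code",   # ask for an authorization code
        "scope": scope,
        "state": state,            # echoed back, used to match the response
        "access_type": "offline",  # Google-specific: also ask for a refresh token
    }
    return f"{AUTH_ENDPOINT}?{urlencode(params)}"


def build_token_request(client_id: str, client_secret: str, redirect_uri: str, code: str) -> dict:
    """Step 3: form fields POSTed to TOKEN_ENDPOINT to exchange the code."""
    return {
        "grant_type": "authorization_code",
        "code": code,
        "client_id": client_id,
        "client_secret": client_secret,
        "redirect_uri": redirect_uri,
    }


url = build_authorization_url(
    "my-client-id",
    "https://example.com/callback",
    "https://www.googleapis.com/auth/drive.readonly",
    "xyz",
)
auth_params = parse_qs(urlparse(url).query)
token_form = build_token_request("my-client-id", "my-secret", "https://example.com/callback", "code-from-redirect")
print(auth_params["response_type"], token_form["grant_type"])
```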

Once the access token expires, you can use the refresh token to request a new access token.

The above is only a rough outline, and some additional concepts have been deliberately omitted. Depending on how you make the request and the type of resource you access, you may also be able to renew the access token without a refresh token.
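The renewal step can be sketched as follows. This is only an illustration: the TokenSet class, the get_valid_access_token helper, and the response shape are hypothetical, and fake_refresh stands in for the real HTTP call to the token endpoint.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class TokenSet:
    access_token: str
    refresh_token: str
    expires_at: float  # Unix timestamp after which the access token is invalid


def get_valid_access_token(
    tokens: TokenSet,
    refresh: Callable[[str], Dict],
    now: Callable[[], float] = time.time,
) -> str:
    """Return a usable access token, refreshing it first if it has expired."""
    if now() >= tokens.expires_at:
        # Assumed response shape: {"access_token": ..., "expires_in": seconds}
        response = refresh(tokens.refresh_token)
        tokens.access_token = response["access_token"]
        tokens.expires_at = now() + response["expires_in"]
    return tokens.access_token


def fake_refresh(refresh_token: str) -> Dict:
    """Stand-in for the request to the token endpoint (grant_type=refresh_token)."""
    return {"access_token": "new-access-token", "expires_in": 3600}


tokens = TokenSet("old-access-token", "my-refresh-token", expires_at=1000.0)
fresh = get_valid_access_token(tokens, fake_refresh, now=lambda: 2000.0)
print(fresh)
```

Passing the clock and the refresh function in as parameters keeps the expiry logic testable without a network connection.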

Two other questions arise:

  1. What if the refresh token also expires? Then the user has to log in and authorize again.
  2. Why distinguish the refresh token from the access token? Why not merge them into a single token with a longer expiry time and have the user log in again each time it expires? This question depends on concepts discussed later and is answered there.

OAuth

Everything from obtaining the token to using it to call the API is in fact the standard OAuth 2.0 process for accessing an API. In this section we go through the concepts surrounding OAuth to get a deeper understanding of what tokens do.

SSO (Single sign-on)

A company has many tools and platforms for its people to use, such as human resources, code management, log monitoring, and budget requests. Having each platform implement its own user system would be a huge waste, so the company runs one shared user system: once users log in, they can access every system. That’s SSO: single sign-on.

SSO is a general term for a class of solutions. In terms of implementation, we have two strategies to choose from: 1) SAML 2.0; 2) OAuth 2.0. Let’s look at the differences between these two approaches.

But before describing the different strategies, let’s describe a few common and fairly important concepts.

Authentication VS Authorisation

  • Authentication: verifying identity
  • Authorisation: granting access rights

Authentication establishes who you are: it identifies whether a visitor is a legitimate user of the system. Authorization determines which resources you have access to. Most people cannot tell the two apart, because from the user’s standpoint they look the same. For a system designer, however, they are two separate responsibilities. A system may need only authentication without authorization, and it does not even need to implement authentication itself: it can rely on Google’s authentication system so that users log in with their Google accounts.
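As a toy illustration of the two responsibilities, the sketch below keeps them in separate functions. The user table and permission table are made up for the example; a real system would store hashed passwords and would not compare them in plain text.

```python
from typing import Optional

# Hypothetical in-memory data, for illustration only.
USERS = {"alice": "wonderland", "bob": "builder"}
PERMISSIONS = {"alice": {"reports", "budget"}, "bob": {"reports"}}


def authenticate(username: str, password: str) -> Optional[str]:
    """Authentication: is this visitor who they claim to be?"""
    if USERS.get(username) == password:
        return username
    return None


def authorize(user: str, resource: str) -> bool:
    """Authorization: may this already-identified user access the resource?"""
    return resource in PERMISSIONS.get(user, set())


user = authenticate("bob", "builder")
print(user, authorize(user, "reports"), authorize(user, "budget"))
```

Bob is authenticated in both cases, yet he is authorized for only one of the two resources, which is exactly the distinction the text draws.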

Authorization Server/Identity Provider(IdP) VS Service Provider(SP)/Resource Server

The service responsible for authentication is called the Authorization Server or Identity Provider, hereinafter IdP. The service responsible for providing resources (API calls) is called the Resource Server or Service Provider, hereinafter SP.

SAML 2.0

The SAML 2.0 flow is as follows:

  • A user who has not yet logged in opens a browser and visits your site (the SP). The SP provides services but does not authenticate users.
  • The SP creates a SAML authentication request for the IdP and redirects the user’s browser to the IdP.
  • After validating the SAML request, the IdP displays a login form in the browser, where the user enters a username and password.
  • Once the user has logged in, the IdP generates a SAML token containing user information (such as the user name). The IdP returns the token to the SP and redirects the user back to the SP (the token is delivered as part of the redirection step, as described below).
  • The SP validates the token and parses out the user information, such as who the user is and what permissions they have. From this point on, the user can access the site’s content based on that information.

When a user successfully logs in to IdP, IdP needs to redirect the user to SP site again. This step is usually done in two ways:

  • HTTP Redirect: not recommended. The length of a redirect URL is limited, so it cannot carry longer content such as a SAML token.
  • HTTP POST: the more conventional approach. After the user logs in, a form is rendered; the user clicks it to submit a POST request to the SP, or JavaScript submits the POST request to the SP automatically.
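A minimal sketch of the HTTP POST option: the IdP renders a page whose form carries the Base64-encoded SAML response in a hidden field and submits itself on load. The field names SAMLResponse and RelayState follow the SAML POST binding; the function name, the SP URL, and the XML body are placeholders, and no real assertion is generated or signed here.

```python
import base64
import html


def render_saml_post_form(acs_url: str, saml_response_xml: str, relay_state: str = "") -> str:
    """Build an HTML page that POSTs the SAML response to the SP."""
    encoded = base64.b64encode(saml_response_xml.encode("utf-8")).decode("ascii")
    return (
        '<html><body onload="document.forms[0].submit()">'
        f'<form method="post" action="{html.escape(acs_url, quote=True)}">'
        f'<input type="hidden" name="SAMLResponse" value="{encoded}">'
        f'<input type="hidden" name="RelayState" value="{html.escape(relay_state, quote=True)}">'
        '<noscript><input type="submit" value="Continue"></noscript>'
        "</form></body></html>"
    )


page = render_saml_post_form(
    "https://sp.example.com/saml/acs",       # hypothetical SP endpoint
    "<samlp:Response>...</samlp:Response>",  # placeholder, not a valid assertion
)
print(page)
```

The noscript button covers the manual case from the list above, where the user has to click to submit the form.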

If your application is Web-based, there is no problem with the above solution. But if you’re developing an iOS or Android mobile app, there’s a problem:

  • When a user opens an application on an iPhone, the user needs to authenticate through IdP
  • The application switches to Safari. After login authentication is complete, the token is returned to the mobile application through HTTP POST

Although the POST URL can launch the app, the mobile app cannot parse the body of the POST, so it cannot read the SAML token.

Of course there are workarounds: instead of switching to the system’s Safari browser during the IdP step, embed a WebView and try to extract the token from it, or use a proxy server. Either way, SAML 2.0 is a poor fit for today’s cross-platform landscape. That may have to do with its age: it came out in 2005, when HTTP POST really was the best solution.

OAuth 2.0

Let’s start with a brief overview of the OAuth 2.0 process under SSO.

  • A user wants to access resources on the SP through a client (a browser or a mobile application). The SP tells the user that authentication is required and redirects the user to the IdP.
  • The IdP asks the user whether the SP may access the user’s information. If the user agrees, the IdP returns an authorization code to the client.
  • The client exchanges the code with the IdP for an access token, then requests resources from the SP with the access token.
  • After receiving the request, the SP verifies the user’s identity with the IdP using the attached token.

So how does OAuth avoid the problem in the SAML flow, where the app cannot parse the POST content? The user returns from the IdP to the client through a URL redirect. URLs allow custom schemes, so the application can be launched even on a phone. In addition, because the IdP passes the client a short code rather than XML, the code can easily be attached to the redirect URL.
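Reading the code back out of the redirect URL is simple, which is the point. In this sketch, myapp is a hypothetical custom URL scheme registered by the mobile app, and the state check is a common safeguard rather than something described in this article.

```python
from urllib.parse import parse_qs, urlparse


def extract_authorization_code(redirect_url: str, expected_state: str) -> str:
    """Pull the authorization code out of the URL the IdP redirected to."""
    query = parse_qs(urlparse(redirect_url).query)
    if query.get("state", [None])[0] != expected_state:
        raise ValueError("state mismatch: response does not belong to this request")
    if "error" in query:
        raise ValueError(f"authorization failed: {query['error'][0]}")
    return query["code"][0]


# The app is launched with this URL after the user approves the request.
code = extract_authorization_code("myapp://callback?code=abc123&state=xyz", "xyz")
print(code)
```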

But the SSO process above does not reflect OAuth’s original intent. OAuth was designed to let one application access another application’s data when the user authorizes it. Its design favors authorization over authentication (although authorizing access to user information indirectly achieves authentication), and Google’s OAuth 2.0 API supports both. So when you log in to a third-party site with your Facebook or Gmail account, an authorization dialog tells you what information the third-party site will be able to access and asks for your consent.

The SSO OAuth process above involves three roles: SP, IdP, and Client. In practice, however, the Client may not exist. For example, if you write a back-end program that periodically pulls the latest data from YouTube through the Google API, then it is your back-end program itself that needs to be authorized through YouTube’s OAuth.

OAuth VS OpenID

If you pay attention, you’ll notice that some sites let you log in with OpenID, which in practice means a Facebook or Google account.

That sounds a lot like OAuth, but they are essentially two different things with different purposes:

  • OpenID is only for authentication, allowing you to log in to multiple sites with the same account. It merely vouches for your identity: when you log in to a site with your Facebook account, that site gets no access to your Facebook data
  • OAuth is for authorization, allowing the authorized party to access the data of the user who granted the authorization

Refresh Token

Now we can answer the question from the first section of this article: why do we need a refresh token?

It exists to separate responsibilities: the refresh token is responsible for authentication, and the access token is responsible for requesting resources. Although both tokens are issued by the IdP, the access token is also exchanged with the SP. If a single token served both roles, a leak on the SP side could expose the user’s identity credential. The IdP and the SP may also be services run by entirely different providers. In the first section we had no such concern because the IdP and the SP were both Google.

Conclusion

This section focused on OAuth and on the difference between authentication and authorization. We can now connect this with the previous section and understand tokens better: tokens serve OAuth as the key to access data. Next, let’s look at another form of this key: JSON Web Token, or JWT.

JWT

An intuitive look

First, let’s get an intuitive feel for JWT. A JWT is essentially a token which, as we learned in the first section, is a credential for accessing a resource.

Some of Google’s APIs, such as the Prediction API or Google Cloud Storage, do not need access to users’ personal data, so an application can call them directly without a user’s authorization. This is the case mentioned in the previous section: an OAuth flow in which no Client takes part. It is accomplished with the help of JWT. The process is as follows:

  • First, create a service account with the Google API.
  • Get the credentials for the service account, including the email address, the client ID, and a public/private key pair
  • Create a signed JWT using the client ID and private key, and send it to Google in exchange for an access token
  • Google returns the access token
  • The program calls the API with the access token
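The JWT that gets sent to Google can be sketched as below. This covers only the assembly of the header and claim set and of the token request body; the RS256 signature itself is left out because it needs the private key and a cryptography library. The claim names, the token endpoint, and the grant type are assumptions based on Google’s service-account documentation, and the email address, scope, and timestamp are placeholders.

```python
import base64
import json
from urllib.parse import urlencode

TOKEN_ENDPOINT = "https://oauth2.googleapis.com/token"


def b64url(data: bytes) -> str:
    """Base64URL-encode without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def build_signing_input(service_account_email: str, scope: str, now: int) -> str:
    """Return '<header>.<claims>', the text to be signed with the private key."""
    header = {"alg": "RS256", "typ": "JWT"}
    claims = {
        "iss": service_account_email,  # who is asking
        "scope": scope,                # which API permissions are requested
        "aud": TOKEN_ENDPOINT,         # who the assertion is intended for
        "iat": now,                    # issued at
        "exp": now + 3600,             # expires one hour later
    }
    return ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )


def build_token_request(signed_jwt: str) -> str:
    """Form body POSTed to TOKEN_ENDPOINT to trade the JWT for an access token."""
    return urlencode({
        "grant_type": "urn:ietf:params:oauth:grant-type:jwt-bearer",
        "assertion": signed_jwt,
    })


signing_input = build_signing_input(
    "my-service@my-project.iam.gserviceaccount.com",
    "https://www.googleapis.com/auth/devstorage.read_only",
    1_700_000_000,
)
# The real signature would replace the placeholder after the final dot.
body = build_token_request(signing_input + ".signature-goes-here")
print(body)
```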

You can even call some APIs with the JWT itself as the bearer token in the HTTP header, without asking Google for an access token first. I think that’s the beauty of JWT.

A technical look

JWT is a JSON token. It consists of three parts: 1) header 2) Payload 3) signature

header

Header is used to describe meta information, such as the algorithm that generates signature:

{
    "typ": "JWT",
    "alg": "HS256"
}

The alg keyword specifies which hash algorithm to use to create the signature

payload

The payload carries the information you want to send to the server. You can include registered claims such as iss (issuer), sub (subject), and exp (expiration time), as well as custom fields such as userId:

{
    "userId": "b08f86af-35da-48f2-8fab-cef3904660bd"
}

signature

This is the signature.

To create a signature, perform the following steps:

  • Obtain the key from the API server; let’s call it secret
  • Base64URL-encode header; call the result headerStr
  • Base64URL-encode payload; call the result payloadStr
  • Join headerStr and payloadStr with a "." to form the string data
  • Pass data and secret to a hash algorithm to compute the signature

If the above description is not intuitive, it can be expressed in pseudocode:

// Signature algorithm
data = base64urlEncode(header) + "." + base64urlEncode(payload)
signature = Hash(data, secret)  // HMAC-SHA256 when alg is HS256
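The same steps as runnable Python, using only the standard library. This is a sketch for the HS256 case only; the function names are ours, and a production system would normally rely on a maintained JWT library rather than hand-rolled code.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64URL-encode without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def sign_hs256(header: dict, payload: dict, secret: str) -> str:
    """Build '<header>.<payload>.<signature>' using HMAC-SHA256."""
    data = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    digest = hmac.new(secret.encode("utf-8"), data.encode("ascii"), hashlib.sha256).digest()
    return f"{data}.{b64url(digest)}"


token = sign_hs256(
    {"typ": "JWT", "alg": "HS256"},
    {"userId": "b08f86af-35da-48f2-8fab-cef3904660bd"},
    "secret",
)
print(token)
```

Note that the JSON is serialized without spaces: the signature covers the exact bytes, so any difference in whitespace or key order yields a different token.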

Suppose our original JSON structure looks like this:

// Header
{
  "typ": "JWT",
  "alg": "HS256"
}
// Payload:
{
  "userId": "b08f86af-35da-48f2-8fab-cef3904660bd"
}

If the key is the string secret, then the final JWT result looks like this

eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJ1c2VySWQiOiJiMDhmODZhZi0zNWRhLTQ4ZjItOGZhYi1jZWYzOTA0NjYwYmQifQ.-xN_h82PHVTCMA9vdoHrcZxH-x5mb11y1537t3rGzcM

You can verify the result on jwt.io

What does JWT bring to the table

The purpose of JWT is not to hide or keep the data secret, but to ensure that the data is actually created by an authorized person (and not tampered with)

Remember that once you have a JWT, you can decode the header and payload without the secret, because they are merely Base64-encoded; the encoding exists only to make the data structure easy to transmit. Although creating the signature resembles encryption, it is really an act of signing that guarantees data integrity and does not encrypt anything.
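To see this for yourself, the sketch below decodes the example token from this article without ever touching the secret. The only subtlety is restoring the Base64 padding that JWT strips.

```python
import base64
import json

JWT = (
    "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9"
    ".eyJ1c2VySWQiOiJiMDhmODZhZi0zNWRhLTQ4ZjItOGZhYi1jZWYzOTA0NjYwYmQifQ"
    ".-xN_h82PHVTCMA9vdoHrcZxH-x5mb11y1537t3rGzcM"
)


def b64url_decode(segment: str) -> bytes:
    """Decode Base64URL text, restoring the padding that JWT strips."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


# The third segment (the signature) is not needed to read the contents.
header_b64, payload_b64, _signature = JWT.split(".")
header = json.loads(b64url_decode(header_b64))
payload = json.loads(b64url_decode(payload_b64))
print(header)
print(payload)
```

This is why nothing sensitive, such as a password, should ever be placed in a JWT payload.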

For the differences between Encoding, Encryption, and Hashing, see this article: Encoding vs. Encryption vs. Hashing

For API calls

The JWT can then be attached to each API call (usually in an HTTP header). Because the SP shares the secret with the application, the back end can recompute the signature with the hash algorithm named in the header, check that it matches, and so decide whether the application has the right to call the API.
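A sketch of that server-side check, under the assumption that both sides use HS256. For simplicity it hard-codes the algorithm instead of reading it from the header, which also avoids trusting a header that an attacker could have altered. The helper names and the sample payloads are made up.

```python
import base64
import hashlib
import hmac


def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def sign_hs256(data: str, secret: str) -> str:
    """Signature over '<header>.<payload>' with the shared secret."""
    return b64url(hmac.new(secret.encode(), data.encode(), hashlib.sha256).digest())


def verify_hs256(token: str, secret: str) -> bool:
    """Recompute the signature and compare it with the one in the token."""
    data, separator, signature = token.rpartition(".")
    if not separator:
        return False  # not in '<data>.<signature>' form
    # compare_digest avoids leaking how much of the signature matched
    return hmac.compare_digest(sign_hs256(data, secret).encode(), signature.encode())


header = b64url(b'{"typ":"JWT","alg":"HS256"}')
data = header + "." + b64url(b'{"userId":"42"}')
token = f"{data}.{sign_hs256(data, 'secret')}"
# Same signature, but the payload was changed after signing.
forged = header + "." + b64url(b'{"userId":"43"}') + "." + sign_hs256(data, "secret")
print(verify_hs256(token, "secret"), verify_hs256(forged, "secret"))
```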

Stateful sessions

Because HTTP is stateless, the client and server need to figure out how to make the conversation stateful. For example, only logged-in users have the permission to invoke certain interfaces. After logging in, you need to remember that the user is logged in. A common approach is to use the session mechanism

The common session model works like this:

  • After the user logs in from the browser, the server generates a unique session ID for the user and stores it in a server-side storage service (such as MySQL or Redis)
  • The session ID is also returned to the browser and stored in a browser cookie under the key SESSION_ID
  • When the user visits the site again, the SESSION_ID in the cookie is sent to the server along with the request
  • The server determines whether the user is logged in by checking whether the SESSION_ID exists in Redis
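The session model above can be sketched in a few lines. Here a plain dictionary stands in for Redis or MySQL, and cookies are represented as a dictionary; the function names are hypothetical.

```python
import secrets
from typing import Dict, Optional

# Stand-in for the server-side store (Redis, MySQL, ...).
SESSIONS: Dict[str, str] = {}


def login(user_id: str) -> str:
    """Create a session and return the ID the browser keeps in its cookie."""
    session_id = secrets.token_urlsafe(32)
    SESSIONS[session_id] = user_id
    return session_id


def current_user(cookies: Dict[str, str]) -> Optional[str]:
    """Look up the SESSION_ID cookie; None means the user is not logged in."""
    return SESSIONS.get(cookies.get("SESSION_ID", ""))


def logout(cookies: Dict[str, str]) -> None:
    """Forget the session on the server side."""
    SESSIONS.pop(cookies.get("SESSION_ID", ""), None)


cookies = {"SESSION_ID": login("alice")}
print(current_user(cookies))
```

The key property is that the server holds the state: every request costs a lookup in the store, but a session can be revoked at any time simply by deleting it.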

As you may have realized, JWT could in theory replace the session mechanism. The server no longer needs to record that a user has logged in, so the back end does not need Redis to hold login state. The client keeps a valid JWT locally and attaches it whenever it calls the API, and on each call the back end validates the JWT attached to the request. This indirectly achieves the goal of authenticating the user.
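For comparison with the session model, here is a rough sketch of the stateless alternative: the server signs a token at login and later trusts whatever is in a token whose signature and expiry check out, with no store involved. The secret, the function names, and the choice of the sub and exp claims are assumptions for illustration.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

SECRET = b"server-side-secret"  # hypothetical; known only to the back end


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(data: str) -> str:
    return _b64url(hmac.new(SECRET, data.encode(), hashlib.sha256).digest())


def issue_token(user_id: str, ttl_seconds: int, now: Optional[float] = None) -> str:
    """At login: put the user ID and an expiry time into a signed token."""
    issued = int(time.time() if now is None else now)
    header = _b64url(json.dumps({"typ": "JWT", "alg": "HS256"}).encode())
    payload = _b64url(json.dumps({"sub": user_id, "exp": issued + ttl_seconds}).encode())
    data = f"{header}.{payload}"
    return f"{data}.{_sign(data)}"


def user_from_token(token: str, now: Optional[float] = None) -> Optional[str]:
    """On each request: accept the token only if signature and expiry check out."""
    try:
        data, signature = token.rsplit(".", 1)
        if not hmac.compare_digest(_sign(data).encode(), signature.encode()):
            return None
        claims = json.loads(_b64url_decode(data.split(".")[1]))
    except (ValueError, IndexError):
        return None
    current = time.time() if now is None else now
    return claims["sub"] if current < claims["exp"] else None


token = issue_token("alice", ttl_seconds=60, now=1000)
print(user_from_token(token, now=1030), user_from_token(token, now=1061))
```

Notice what is missing compared with the session sketch: there is no logout function, because the server keeps nothing it could delete.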

But can JWT really replace the session mechanism? What are the pros and cons of doing this? We’ll leave these questions for the next article

This article is also published on my Zhihu column; you are welcome to follow it.
