Preface

To understand one concept, we usually need to know ten others. Before deciding whether JWT (JSON Web Token) can replace session management, we need to understand what a token is and how an access token differs from a refresh token.

We also need to understand what OAuth is, what SSO is, how OAuth and SAML differ as SSO strategies, how OAuth differs from OpenID, and, more importantly, how authorization differs from authentication.

Finally, we introduce JSON Web Tokens, discuss the strengths and weaknesses of JWT in session management, and try to address those weaknesses to see what doing so costs.

Main text

Examples of OAuth authorization and API calls in this article are from the Google API.

About the Token

The word token has several meanings even within computing. Here, a token is a credential for accessing a resource. For example, when you call the Google API, you must present a valid token to show that your request is legitimate. Google issues this token to you, and it grants access to the resources behind the API.

There are several ways to carry a token when calling an API: in an HTTP header, in a URL parameter, or through the Google client libraries:

  • HTTP Header
GET /drive/v2/files HTTP/1.1
Host: www.googleapis.com
Authorization: Bearer <token>
  • The URL parameter
GET https://www.googleapis.com/drive/v2/files?access_token=<token>
  • Python libraries
from googleapiclient.discovery import build

# `credentials` is an authorized credentials object obtained beforehand.
drive = build('drive', 'v2', credentials=credentials)
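As a minimal sketch of the header variant, the same request can be built with the Python standard library. The token value is a placeholder, and the request is only constructed here, not sent.

```python
import urllib.request

# Assumption: a placeholder that stands in for a real access token.
ACCESS_TOKEN = "placeholder-access-token"


def build_drive_request(token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request carrying the token in the Authorization header."""
    return urllib.request.Request(
        "https://www.googleapis.com/drive/v2/files",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


request = build_drive_request(ACCESS_TOKEN)
print(request.get_method(), request.full_url)
print(request.get_header("Authorization"))
```

Passing the request to urllib.request.urlopen would actually send it; with a placeholder token the API would reject the call.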

More precisely, the tokens used above to invoke the API are access tokens. Access tokens generally have an expiry time and must be obtained again once they expire. So how do you get a new one? Let’s first look at how a token is obtained the first time:

  1. When you register an application with the Google API, you receive credentials, including a client ID and a secret. Not every application type has a secret.

  2. Next, you ask Google for an access token. We skip some details here, such as the request parameters (which include the secret, of course). The important point is that if you want to access a user’s resources, the user is prompted to authorize the request.

  3. If the user grants authorization, Google returns the access token. Alternatively, Google returns an authorization code, which you then exchange for the access token.

Once the token is obtained, you can access the API with the token.

The process is as follows:

Note: when the authorization code is exchanged for an access token in the third step, Google returns not only the access token but also additional information, including the refresh token, which is what later token renewals rely on.
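The two requests behind steps 2 and 3 can be sketched as follows. The endpoints are the ones Google documents for OAuth 2.0; the client ID, secret, redirect URI and code are placeholders, and nothing is sent over the network.

```python
from urllib.parse import urlencode

AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/v2/auth"
TOKEN_ENDPOINT = "https://oauth2.googleapis.com/token"


def authorization_url(client_id: str, redirect_uri: str, scope: str) -> str:
    """URL the user is sent to in order to grant access (step 2)."""
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "code",   # ask for an authorization code
        "scope": scope,
        "access_type": "offline",  # Google-specific: also ask for a refresh token
    }
    return f"{AUTH_ENDPOINT}?{urlencode(params)}"


def code_exchange_body(client_id: str, client_secret: str, code: str, redirect_uri: str) -> str:
    """Form body POSTed to TOKEN_ENDPOINT to swap the code for tokens (step 3)."""
    return urlencode({
        "grant_type": "authorization_code",
        "code": code,
        "client_id": client_id,
        "client_secret": client_secret,
        "redirect_uri": redirect_uri,
    })


# Every value below is a placeholder, not a real credential.
url = authorization_url(
    "my-client-id",
    "https://example.com/callback",
    "https://www.googleapis.com/auth/drive.readonly",
)
body = code_exchange_body(
    "my-client-id", "my-secret", "code-from-redirect", "https://example.com/callback"
)
print(url)
print("POST", TOKEN_ENDPOINT)
print(body)
```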

Once the access token expires, you can request a new one with the refresh token.
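A refresh request is a small form POST to the token endpoint. The sketch below only builds the body; the credentials and the stored refresh token are placeholders.

```python
from urllib.parse import parse_qs, urlencode

TOKEN_ENDPOINT = "https://oauth2.googleapis.com/token"


def refresh_request_body(client_id: str, client_secret: str, refresh_token: str) -> str:
    """Form body that asks the token endpoint for a fresh access token."""
    return urlencode({
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
        "client_id": client_id,
        "client_secret": client_secret,
    })


# Placeholders: a real application would load these from secure storage.
refresh_body = refresh_request_body("my-client-id", "my-secret", "stored-refresh-token")
print("POST", TOKEN_ENDPOINT)
print(refresh_body)
```

Note that the user is not involved in this request: the refresh token alone is enough to obtain a new access token.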

The above is just a rough outline of the process, and some additional concepts have been deliberately omitted. Depending on how you request and the type of resource you access, you can also update the Access token without the refresh token.

Two other questions arise:

  1. What if the refresh token also expires? In that case, the user has to log in and authorize again.

  2. Why distinguish the refresh token from the access token? Why not merge them into a single token with a longer expiry and have the user log in again each time it expires? This question depends on concepts introduced later, so the answer is deferred until then.

OAuth 2.0

The steps from obtaining a token to calling an interface with it are in fact the standard OAuth 2.0 process for accessing an API. The following sections introduce the concepts surrounding OAuth to give a deeper understanding of the role tokens play.

SSO (Single sign-on)

A company usually runs many internal platforms, such as human resources, code management, log monitoring, and budget requests. Having each platform implement its own user system would be a huge waste, so the company maintains one shared user system: once users log in, they can access every platform. This is single sign-on.

SSO is a general term for a class of solutions, and in terms of implementation, we have two strategies to choose from:

  • SAML 2.0

  • OAuth 2.0

Let’s distinguish the differences between these two authorization methods. But before describing the different strategies, let’s describe a few common features, and rather important concepts.

Authentication vs. Authorization

  • Authentication: verifying identity.

  • Authorization: granting access to resources.

Authentication decides whether you may enter the system at all by checking that the visitor is a legitimate user. Authorization decides which resources you may access once you are in.

Most people cannot tell the two apart, because from the user’s standpoint they look like a single step. For a system designer, however, they are two distinct responsibilities. We may need only authentication and no authorization, and we may not even need to implement authentication ourselves: by relying on Google’s authentication system, users can log in with a Google account.
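To make the split concrete, here is a minimal sketch that keeps the two checks in separate functions. The user table, the permission names and the password are all hypothetical.

```python
import hashlib
import hmac


def _hash(password: str, salt: bytes) -> bytes:
    """Derive a password hash; the parameters are illustrative only."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


# Hypothetical in-memory user table: name -> salt, password hash, permissions.
USERS = {
    "alice": {
        "salt": b"demo-salt",
        "pw": _hash("wonderland", b"demo-salt"),
        "perms": {"reports:read"},
    },
}


def authenticate(username: str, password: str) -> bool:
    """Authentication: is this visitor really the user they claim to be?"""
    user = USERS.get(username)
    if user is None:
        return False
    return hmac.compare_digest(user["pw"], _hash(password, user["salt"]))


def authorize(username: str, permission: str) -> bool:
    """Authorization: may this already identified user access this resource?"""
    user = USERS.get(username)
    return user is not None and permission in user["perms"]


print(authenticate("alice", "wonderland"))   # True
print(authorize("alice", "reports:read"))    # True
print(authorize("alice", "budget:approve"))  # False
```

Delegating login to Google replaces only authenticate; the application can still keep its own authorize.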

Authorization Server/Identity Provider(IdP)

The service responsible for authentication is called the Authorization Server or Identity Provider, hereinafter referred to as IDP.

Service Provider(SP)/Resource Server

The service responsible for providing resources (API calls) is called the Resource Server or Service Provider (SP).

SAML 2.0

The SAML 2.0 flow works as follows:

  1. A user who is not logged in opens a browser and visits your site (the SP), which provides services but does not authenticate users itself.

  2. The SP creates a SAML authentication request and redirects the user’s browser to the IDP with it.

  3. After validating the request from the SP, the IDP presents a login form in the browser, where the user enters a user name and password.

  4. Once the user has logged in, the IDP generates a SAML token (also known as a SAML assertion, essentially an XML node) containing user information such as the user name and attributes. The IDP returns the token to the SP and redirects the user back to the SP (the token is delivered as part of the redirection step, as described below).

  5. The SP verifies the token and parses the user information, such as who the user is and what permissions the user has. Based on this information, the user can access the content of our site.

After a user logs in successfully, the IDP has to send the user back to the SP site. This step is usually done in one of two ways:

  • HTTP redirect: not recommended, because redirect URLs are limited in length and cannot carry long content such as a SAML token.

  • HTTP POST request: the more conventional approach. After the user logs in, the IDP renders a form that POSTs to the SP when the user submits it; alternatively, JavaScript can submit the POST to the SP automatically.
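The POST variant can be sketched as a page the IDP renders after login. SAMLResponse is the form field name used by the SAML HTTP POST binding; the SP endpoint URL and the XML below are placeholders, and a real assertion is signed and much larger.

```python
import base64
import html


def saml_post_form(sp_url: str, assertion_xml: str) -> str:
    """Render an auto-submitting HTML form that carries a base64-encoded SAML response."""
    encoded = base64.b64encode(assertion_xml.encode("utf-8")).decode("ascii")
    return (
        f'<form method="post" action="{html.escape(sp_url, quote=True)}">'
        f'<input type="hidden" name="SAMLResponse" value="{encoded}">'
        "<noscript><button>Continue</button></noscript>"
        "</form>"
        # JavaScript submits the form so the user does not have to click.
        "<script>document.forms[0].submit();</script>"
    )


# Placeholder SP endpoint and assertion body.
page = saml_post_form("https://sp.example.com/acs", "<Assertion>alice</Assertion>")
print(page)
```

Because the assertion travels in the POST body rather than in the URL, it is not subject to URL length limits.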

If your application is web-based, the approach above works without trouble. But if you are developing an iOS or Android mobile app, a problem appears:

  1. When a user opens the app on an iPhone, the user needs to authenticate through the IDP.

  2. The app switches to Safari. After login is complete, the token is returned to the mobile app through an HTTP POST.

Although the URL of the POST can launch the app, the mobile app cannot read the POST body, so it cannot read the SAML token.

There are workarounds, of course: instead of switching to the system Safari browser during the IDP authentication stage, complete the login in an embedded WebView and extract the token from it, or route the response through a proxy server.

In any case, SAML 2.0 is not a good fit for today’s cross-platform landscape, which may have something to do with its age. It came out in 2005, when HTTP POST was really the best option.

OAuth 2.0

Let’s start with a brief overview of the OAuth 2.0 process under SSO.

  1. A user wants to access resources on the SP through a client (a browser or a mobile application), but the SP tells the user that authentication is required and redirects the user to the IDP.

  2. The IDP asks the user whether the SP may access the user’s information. If the user agrees, the IDP returns an authorization code to the client.

  3. The client exchanges the authorization code with the IDP for an access token, and then requests resources from the SP with that access token.

  4. On receiving the request, the SP takes the attached token and asks the IDP to verify the user’s identity. Once the identity is confirmed, the SP serves the resources to the client.

So how does OAuth avoid the problem in the SAML flow, where the client cannot read the POST content?

  • On the one hand, the user is sent back from the IDP to the client through a URL redirect, and the URL may use a custom scheme, so the app can be launched even on a phone.

  • On the other hand, because the IDP passes the client an authorization code rather than an XML document, the code fits easily into the redirect URL.
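On the client side, reading the code from such a redirect is a matter of parsing the URL. In this sketch the myapp scheme, the callback path and the parameter values are hypothetical; state is the standard OAuth parameter a client uses to match the response to its own request.

```python
from urllib.parse import parse_qs, urlparse


def extract_code(redirect_url: str, expected_state: str) -> str:
    """Pull the authorization code out of the redirect URL the app was launched with."""
    query = parse_qs(urlparse(redirect_url).query)
    # Reject responses that do not belong to the request this client started.
    if query.get("state", [None])[0] != expected_state:
        raise ValueError("state mismatch")
    return query["code"][0]


# Hypothetical custom-scheme URL that launches the mobile app.
code = extract_code("myapp://oauth/callback?code=4%2FabcDEF&state=xyz123", "xyz123")
print(code)  # 4/abcDEF
```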

The SSO process above, however, does not reflect the original intent of OAuth. The idea behind OAuth is that one application lets another access its data with the user’s permission.

OAuth was designed for authorization rather than authentication (although obtaining authorized user information indirectly authenticates the user), and Google’s OAuth 2.0 API supports both. That is why, when you log in to a third-party site with your Facebook or Gmail account, an authorization dialog appears, telling you what information the site wants to access and asking for your permission.

In the SSO OAuth process above, three roles are involved: SP, IDP and Client. In practice, however, a separate Client does not always exist. For example, if you write a back-end program that periodically pulls the latest program data from YouTube through the Google API, that back-end program itself needs to be authorized through YouTube’s OAuth.

OAuth vs. OpenID

If you pay attention, you will notice that some sites let you log in with OpenID, typically through an existing account such as Facebook or Google.

OpenID looks very similar to OAuth, but the two are fundamentally different:

  • OpenID: used only for authentication. It lets you log in to multiple sites with the same account. It merely vouches that your identity is legitimate; when you log in to a site with your Facebook account, that site gets no access to your Facebook data.

  • OAuth: used for authorization. It lets the authorized party access user data held by the authorizing party.

Refresh Token

Now we can answer the earlier question: why do we need a refresh token?

The reason is separation of responsibilities:

  • Refresh token: responsible for identity authentication;

  • Access token: responsible for requesting resources.

Although both the refresh token and the access token are issued by the IDP, the access token is also sent to the SP. If a single token served both purposes, a leak on the SP side could expose the credential that proves the user’s identity, and the IDP and the SP may well be services run by entirely different providers. In the examples above we had no such concern, because the IDP and the SP are both Google.

JWT

A preliminary understanding

JWT is essentially a token; as we mentioned above, it is a credential to access a resource.

Some of Google’s APIs, such as the Prediction API or Google Cloud Storage, do not need access to users’ personal data, so the application can call them directly without user authorization, just like the OAuth case above in which no Client takes part. This is accomplished with the help of JWT. The process is as follows:

  1. Create a Service Account on the Google API.

  2. Get the credentials for the service account, including the email address, the client ID, and a public/private key pair.

  3. Create a JWT signed with the private key, using the Client ID, and send this JWT to Google in exchange for an access token.

  4. Google returns the access token.

  5. The program calls the API with the access token.

With some APIs you can even use the JWT itself as the bearer token in the HTTP header, without asking Google for an access token first. That is the beauty of JWT.
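As a rough sketch of step 3, the unsigned parts of such a JWT and the shape of the exchange request look like this. The claim names follow Google’s service account documentation, where iss is the service account email; the email address and scope are placeholders, and the RS256 signing step is left out because it needs a cryptography library.

```python
import json
import time

TOKEN_ENDPOINT = "https://oauth2.googleapis.com/token"


def service_account_claims(service_email: str, scope: str, now: int, lifetime: int = 3600) -> dict:
    """Claim set for the JWT that is exchanged for an access token."""
    return {
        "iss": service_email,    # who issues (and signs) the JWT
        "scope": scope,          # which API permissions are requested
        "aud": TOKEN_ENDPOINT,   # who the JWT is intended for
        "iat": now,              # issued-at time, seconds since the epoch
        "exp": now + lifetime,   # expiry time
    }


# Placeholder service account and scope.
claims = service_account_claims(
    "my-service@my-project.iam.gserviceaccount.com",
    "https://www.googleapis.com/auth/devstorage.read_only",
    now=int(time.time()),
)
jwt_header = {"alg": "RS256", "typ": "JWT"}

# Form fields POSTed to TOKEN_ENDPOINT; the assertion is the signed JWT.
exchange_form = {
    "grant_type": "urn:ietf:params:oauth:grant-type:jwt-bearer",
    "assertion": "<header.claims.signature, signed with the private key>",
}
print(json.dumps(jwt_header))
print(json.dumps(claims, indent=2))
print(exchange_form["grant_type"])
```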

A closer look

JWT, as its name implies, is a token built from JSON structures. It consists of three parts:

  • header

  • payload

  • signature

header

The header describes meta information, such as the algorithm used to generate the signature:

{
    "typ": "JWT",
    "alg": "HS256"
}

The alg field specifies which algorithm is used to create the signature; HS256 means HMAC with SHA-256.

payload

The payload carries the information you want to send to the server. You can use registered fields such as iss (Issuer), sub (Subject) and exp (Expiration Time), or add custom fields such as userId:

{
    "userId": "b08f86af-35da-48f2-8fab-cef3904660bd"
}

signature

Creating a signature involves the following steps:

  1. Get the key, say secret, from the interface server.

  2. Base64url-encode the header; call the result headerStr.

  3. Base64url-encode the payload; call the result payloadStr.

  4. Join headerStr and payloadStr with the . character to form the string data.

  5. With data and secret as inputs, compute the signature using the hash algorithm named in the header.

If the above description is not intuitive, it can be expressed in pseudocode:

// Signature algorithm
data = base64urlEncode(header) + "." + base64urlEncode(payload)
signature = Hash(data, secret)

Suppose our original JSON structure looks like this:

// Header
{
    "typ": "JWT",
    "alg": "HS256"
}

// Payload
{
    "userId": "b08f86af-35da-48f2-8fab-cef3904660bd"
}

If the key is the string secret, the final JWT result looks like this:

eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJ1c2VySWQiOiJiMDhmODZhZi0zNWRhLTQ4ZjItOGZhYi1jZWYzOTA0NjYwYmQifQ.-xN_h82PHVTCMA9vdoHrcZxH-x5mb11y1537t3rGzcM

You can verify this result on jwt.io.
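The signing steps can also be reproduced with the Python standard library alone. This is a minimal sketch, assuming compact JSON (no extra whitespace) and HS256; the helper names b64url and sign_jwt are made up for this example.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64url-encode and drop the trailing '=' padding, as JWT does."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def sign_jwt(header: dict, payload: dict, secret: str) -> str:
    """Build header.payload.signature with HMAC-SHA256."""
    # Compact separators keep the JSON free of extra whitespace.
    header_str = b64url(json.dumps(header, separators=(",", ":")).encode("utf-8"))
    payload_str = b64url(json.dumps(payload, separators=(",", ":")).encode("utf-8"))
    data = f"{header_str}.{payload_str}"
    digest = hmac.new(secret.encode("utf-8"), data.encode("ascii"), hashlib.sha256).digest()
    return f"{data}.{b64url(digest)}"


token = sign_jwt(
    {"typ": "JWT", "alg": "HS256"},
    {"userId": "b08f86af-35da-48f2-8fab-cef3904660bd"},
    "secret",
)
print(token)
```

The printed token can be compared segment by segment with the one shown above; any difference in JSON key order or whitespace changes the encoded segments and therefore the signature.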

What does JWT bring to the table

Ensure data integrity

The purpose of JWT is not to hide the data or keep it secret, but to guarantee that the data was really created by an authorized party and has not been tampered with in transit.

Remember that once you have a JWT, you can decode its header and payload without the secret, because both are merely base64-encoded. The encoding is there to make the data structures easy to transmit.

Although creating the signature resembles encryption, it is in essence an act of signing that guarantees data integrity; it does not actually encrypt any data.
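A short sketch shows how little is needed to read a JWT. It decodes the example token from this article without ever touching the secret; the helper names are made up for this example.

```python
import base64
import json

# The example token from this article.
TOKEN = (
    "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9."
    "eyJ1c2VySWQiOiJiMDhmODZhZi0zNWRhLTQ4ZjItOGZhYi1jZWYzOTA0NjYwYmQifQ."
    "-xN_h82PHVTCMA9vdoHrcZxH-x5mb11y1537t3rGzcM"
)


def b64url_decode(segment: str) -> bytes:
    """Restore the stripped '=' padding, then base64url-decode."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def peek(token: str):
    """Return (header, payload) as dicts without checking the signature."""
    header_str, payload_str, _signature = token.split(".")
    return json.loads(b64url_decode(header_str)), json.loads(b64url_decode(payload_str))


decoded_header, decoded_payload = peek(TOKEN)
print(decoded_header)   # {'typ': 'JWT', 'alg': 'HS256'}
print(decoded_payload)  # {'userId': 'b08f86af-35da-48f2-8fab-cef3904660bd'}
```

Anything placed in the payload should therefore be treated as readable by whoever holds the token.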

For interface calls

The JWT can then be attached to the API call (usually in an HTTP header). Because the SP shares a secret with the application, the SP can recompute the signature with the hash algorithm named in the header and check that it matches, and thereby decide whether the application has the right to call the API.

Stateful Session

Because HTTP is stateless, the client and the server need some way to make their conversation stateful. For example, only logged-in users may invoke certain interfaces, so after a user logs in, the server has to remember that fact. A common approach is the session mechanism.

The common session model works like this:

  1. After the user logs in through the browser, the server generates a unique session ID for the user and stores it in a server-side storage service (such as MySQL or Redis).

  2. The session ID is also returned to the browser and stored in a browser cookie under the key SESSION_ID.

  3. When the user visits the site again, the SESSION_ID in the cookie is sent to the server along with the request.

  4. The server determines whether the user is logged in by checking whether that SESSION_ID exists in the storage service.
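The four steps can be sketched with a dictionary standing in for Redis or MySQL. The function names and the cookie dictionary are hypothetical; a real server would read and set cookies through its web framework.

```python
import secrets

# Stand-in for the server-side store (Redis, MySQL): session ID -> user ID.
SESSIONS = {}


def login(user_id: str) -> str:
    """Steps 1 and 2: create a session and return the ID to place in the cookie."""
    session_id = secrets.token_urlsafe(32)  # unguessable random ID
    SESSIONS[session_id] = user_id
    return session_id


def current_user(cookies: dict):
    """Steps 3 and 4: look the cookie value up in the store; None means not logged in."""
    return SESSIONS.get(cookies.get("SESSION_ID", ""))


def logout(cookies: dict) -> None:
    """Deleting the server-side record ends the session immediately."""
    SESSIONS.pop(cookies.get("SESSION_ID", ""), None)


cookie_jar = {"SESSION_ID": login("alice")}
print(current_user(cookie_jar))                  # alice
print(current_user({"SESSION_ID": "forged-id"}))  # None
logout(cookie_jar)
print(current_user(cookie_jar))                  # None
```

Every request costs one lookup in the store, and that server-side state is exactly what a JWT-based design tries to avoid.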

As you may have realized, JWT could in theory replace the session mechanism. Users do not need to log in beforehand, and the backend does not need Redis to record login state. The client keeps a valid JWT locally and attaches it whenever it calls an interface; on each call, the backend verifies the JWT attached to the request. This indirectly achieves the goal of authenticating the user.
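A minimal sketch of that idea follows: the backend keeps only a secret, and each request is checked from the token alone. The secret, the 15-minute lifetime and the function names are assumptions made for this example, and the clock is passed in explicitly so the behaviour is easy to follow.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"server-side-secret"  # assumption: known only to the backend


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(segment: str) -> bytes:
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def _sign(data: str) -> str:
    return _b64url(hmac.new(SECRET, data.encode("utf-8"), hashlib.sha256).digest())


def issue(user_id: str, now: int, ttl: int = 900) -> str:
    """Create a token that carries the user ID and an expiry time."""
    header = _b64url(b'{"typ":"JWT","alg":"HS256"}')
    payload = _b64url(json.dumps({"userId": user_id, "exp": now + ttl}).encode("utf-8"))
    data = f"{header}.{payload}"
    return f"{data}.{_sign(data)}"


def current_user(token: str, now: int):
    """Return the user ID if the token is authentic and unexpired, else None."""
    try:
        header, payload, signature = token.split(".")
    except ValueError:
        return None
    expected = _sign(f"{header}.{payload}")
    if not hmac.compare_digest(expected.encode("ascii"), signature.encode("utf-8")):
        return None
    claims = json.loads(_b64url_decode(payload))
    if claims["exp"] <= now:
        return None  # expired: the client must obtain a new token
    return claims["userId"]


session_token = issue("alice", now=1_000)         # exp = 1_900
print(current_user(session_token, now=1_500))    # alice
print(current_user(session_token, now=2_000))    # None
```

No lookup in a store is needed, but there is also no record to delete, so a token stays valid until it expires.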

But can JWT really replace the session mechanism? What are the pros and cons of doing this? These questions will be left for the next article.

