In interviews at Internet companies, the interviewer will often ask the following question:

“If browser cookies are disabled, how do I track and authenticate users?”

Unfortunately, a large number of candidates still cannot explain the difference between cookies and sessions. Real projects contain equally surprising cases: one team stored the user ID in local storage and used it as a token, claiming that cookies were outdated and should be abandoned; a mobile project required the client to simulate cookies so that it could consume the server’s API the way Ajax does in a browser.

The Internet was built on HTTP, which became popular for its simplicity. But HTTP is stateless (at the communication level, virtual circuits are much more expensive than datagrams), so people have come up with all sorts of ways to keep track of users. These include the cookie/session mechanism, tokens, Flash cookies that work across browsers, and even browser fingerprinting.

User identity can be hidden in many places (browser fingerprinting does not even require storage)

Much has been written about specific technologies such as Spring Security, so this article does not cover framework usage or code-level implementation. We’ll discuss the difference between authentication and authorization, then introduce some of the most widely used techniques in the industry, and finally talk about how to choose the right authentication approach when building an API.

Authentication, authorization, credentials

First of all, authentication and authorization are two different concepts. To make our APIs more secure and cleanly designed, it is important to understand the difference between them.

Authentication establishes the identity of the current user. After the user logs in, the system can track that identity and perform operations according to the corresponding business logic. Most systems track identity even when the user is not logged in, treating the visitor as a guest or anonymous user. Authentication answers the question “Who am I?”

Authorization is different. Authorization determines which resources an identity is allowed to access: after the user’s identity is obtained, the user’s permissions are checked. In a single system, authorization usually goes hand in hand with authentication, but in a multi-system architecture with open APIs, authorization can be handled by a different system, as with OAuth. Authorization answers the question “What can I do?”

Authentication and authorization both rest on a medium (a credential) that marks the visitor’s identity and rights. In real life, everyone needs an ID card to access a bank account, get married, or handle pension insurance; the ID card is a credential for authentication. In ancient military campaigns, the emperor would issue a tally to a general going into battle. Subordinate officers did not care who held the tally; they only needed to execute the commands it authorized. In the Internet world, the server issues a session ID to each visitor and stores it in a cookie, which is also a credential technology. Digital credentials take many forms: SSH login keys, JWT tokens, one-time passwords, and so on.

A user account is not necessarily a row in a database table. Some enterprise IT systems place heavier demands on account management and permissions. Account management technology helps us manage user accounts in different ways and share accounts between systems. Examples include Microsoft’s Active Directory (AD), the Lightweight Directory Access Protocol (LDAP), and even blockchain technology.

Another important concept is the access control (AC) policy. If we need to divide permissions on resources at a very fine granularity, we have to decide how a user’s identity maps to access to restricted resources: access control based on access control lists (ACL), role-based access control (RBAC), or some other access control policy.
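
As a rough illustration of the difference between the two policies, here is a minimal Python sketch. The user names, roles, and resource names are invented for the example and do not come from any particular framework.

```python
# ACL: each resource lists the users allowed to perform each action.
ACL = {
    "report:42": {"read": {"alice", "bob"}, "write": {"alice"}},
}

# RBAC: users are assigned roles, and roles are assigned permissions.
USER_ROLES = {"alice": {"editor"}, "bob": {"viewer"}}
ROLE_PERMISSIONS = {
    "editor": {"report:read", "report:write"},
    "viewer": {"report:read"},
}


def acl_allows(user: str, resource: str, action: str) -> bool:
    """Check the per-resource access control list."""
    return user in ACL.get(resource, {}).get(action, set())


def rbac_allows(user: str, permission: str) -> bool:
    """Check whether any of the user's roles grants the permission."""
    return any(
        permission in ROLE_PERMISSIONS.get(role, set())
        for role in USER_ROLES.get(user, set())
    )
```

With an ACL, granting access means editing every affected resource; with RBAC, it usually means assigning a role to the user once.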

In popular technologies and frameworks these concepts are rarely implemented in isolation, so in practice people often argue about whether OAuth2 is authentication or authorization. For ease of understanding, I’ve included a glossary of common technologies and concepts at the end of the article. I’ll look at several authentication and authorization techniques that are commonly used in API development: HTTP Basic Authentication, HMAC, OAuth2, and the credential technology JWT.

HTTP Basic Authentication

You have almost certainly used this method, even if you did not know its name. Not long ago, when you opened the admin interface of a home router, the browser would often pop up a form asking for a username and password.

Behind the scenes, when the user enters the username and password, the browser does something very simple for you:

  1. Combine the username and password, then Base64-encode the result

  2. Prefix the encoded string with Basic and send it in a header named Authorization

An API can offer HTTP Basic Authentication just as easily; the client only needs to Base64-encode the username and password and send them:

  1. Join the username and password with a colon, for example, username:abc123456
  2. Because the username or password may contain characters outside the ASCII range, UTF-8 encoding is recommended
  3. Encode the resulting string using Base64
  4. Add Basic plus the encoded string to the HTTP request header, for example, Authorization: Basic QWxhZGRpbjpPcGVuU2VzYW1l (the encoding of Aladdin:OpenSesame)
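
The steps above can be sketched in a few lines of Python using only the standard library. The function names are chosen for this example; they are not part of any framework.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of the Authorization header for HTTP Basic Authentication."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def parse_basic_auth_header(value: str) -> tuple[str, str]:
    """Server side: recover (username, password) from the header value."""
    scheme, _, encoded = value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic credential")
    decoded = base64.b64decode(encoded).decode("utf-8")
    # Split on the first colon only: the password may itself contain colons.
    username, _, password = decoded.partition(":")
    return username, password
```

Calling basic_auth_header("Aladdin", "OpenSesame") produces the header value Basic QWxhZGRpbjpPcGVuU2VzYW1l, and anyone who sees that header can reverse it with parse_basic_auth_header, which is why the scheme is only safe over an encrypted connection.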

This approach is very simple to implement and is used in a large number of scenarios. The drawback is just as obvious: Base64 is an encoding, not encryption. (In fact, a client that has no pre-configured key cannot perform any reliable encryption on its own, so we all rely on TLS.) The fatal weakness is that the encoded password is easily leaked in transit if the connection is not encrypted. And because the password does not expire, the only remedy after a leak is to change the password.

HMAC (AK/SK) authentication

When we integrate with some PaaS platforms and payment platforms, we are required to generate an Access Key (AK) and a Secure Key (SK, often called a secret key) in advance, and then authenticate each request by signing it, which avoids transmitting the Secure Key. In most cases a signature can be used only once, to prevent replay attacks.

This AK/SK-based authentication is implemented mainly with a Hash-based Message Authentication Code (HMAC), so it is often called HMAC authentication, although that name is not quite accurate. HMAC is simply a way of generating a message digest from a hash algorithm and a key; how it is applied differs from one API design to another.

In the authentication design of network communication, HMAC serves as the credential generation algorithm, so that sensitive information such as passwords never travels over the network. The basic process is as follows:

  1. The client registers an Access Key (AK or APP ID) and a Secure Key (SK) with the authentication server in advance.
  2. When calling the API, the client sorts the parameters and the Access Key in natural order and signs them with the Secure Key, producing a digest that is sent as an additional parameter.
  3. The server performs the same digest calculation with the pre-configured Secure Key and requires exactly the same result.
  4. Note that the Secure Key must never be transmitted over the network or stored in an untrusted location (a browser, for example).
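
A minimal sketch of this process in Python follows. The canonical form (sorted, URL-encoded parameters), the choice of SHA-256, and the function names are assumptions made for the example; real platforms each define their own signing rules.

```python
import hashlib
import hmac
from urllib.parse import urlencode


def sign_request(params: dict, access_key: str, secure_key: str) -> str:
    """Return a hex HMAC-SHA256 digest over the sorted parameters and the AK."""
    signed_fields = dict(params, access_key=access_key)
    # Sorting makes the canonical string independent of parameter order.
    canonical = urlencode(sorted(signed_fields.items()))
    return hmac.new(
        secure_key.encode("utf-8"), canonical.encode("utf-8"), hashlib.sha256
    ).hexdigest()


def verify_request(params: dict, access_key: str, digest: str, lookup_secure_key) -> bool:
    """Server side: recompute the digest with the stored SK and compare."""
    secure_key = lookup_secure_key(access_key)
    if secure_key is None:
        return False
    expected = sign_request(params, access_key, secure_key)
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected, digest)
```

Only the Access Key and the digest are sent with the request; the Secure Key stays on each side. Here lookup_secure_key stands for whatever store the server uses, for example the get method of a dictionary.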

To make every request signature unique and so prevent replay attacks, we need to add some varying input to the signature.

Industry standards offer two typical approaches: the challenge/response algorithm (OCRA: OATH Challenge-Response Algorithm) and the time-based one-time password algorithm (TOTP: Time-based One-Time Password Algorithm).

Challenge/response algorithm

The challenge/response algorithm requires the client to send an initial request to the server, receive a 401 Unauthorized response, and obtain a random string (nonce) from it. The client then includes the nonce in the HMAC signature described above. The server verifies the signature using the nonce it issued; because each nonce is accepted only once, every digest is unique.
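
The server side of this exchange might look like the following sketch. It is a simplified illustration and not the OCRA standard itself: the class name, the in-memory nonce store, and the message format are all assumptions of this example.

```python
import hashlib
import hmac
import secrets


def sign(secure_key: str, nonce: str, message: str) -> str:
    """HMAC-SHA256 over the nonce and the message, hex encoded."""
    payload = f"{nonce}\n{message}".encode("utf-8")
    return hmac.new(secure_key.encode("utf-8"), payload, hashlib.sha256).hexdigest()


class ChallengeServer:
    def __init__(self, secure_keys: dict):
        self._secure_keys = secure_keys  # access key -> secure key
        self._pending = {}               # nonce -> access key it was issued to

    def issue_nonce(self, access_key: str) -> str:
        """Called when answering the first request with 401."""
        nonce = secrets.token_hex(16)
        self._pending[nonce] = access_key
        return nonce

    def verify(self, access_key: str, nonce: str, message: str, digest: str) -> bool:
        secure_key = self._secure_keys.get(access_key)
        # pop() consumes the nonce, so replaying the same request fails.
        issued_to = self._pending.pop(nonce, None)
        if secure_key is None or issued_to != access_key:
            return False
        return hmac.compare_digest(sign(secure_key, nonce, message), digest)
```

A production server would also expire unused nonces after a short time; that is left out here to keep the sketch small.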

Time-based one-time password authentication

To avoid the extra request for a nonce, another approach uses a timestamp: the client and the server keep their clocks synchronized and agree on the signature within a certain time window (about one minute).

Here the timestamp only defines the time window for authentication, so strictly speaking this is not a time-based one-time password algorithm. The standard Time-based One-Time Password algorithm is widely used in two-step verification, for example in Google Authenticator, which needs no network communication (but does depend on an accurate clock). The principle is that the client and the server share a key and each uses HMAC to compute the same verification code for the current time window.
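
The standard algorithm is compact enough to sketch with the Python standard library. The code below follows the published HOTP (RFC 4226) and TOTP (RFC 6238) definitions with HMAC-SHA1; the 30-second step and 6-digit default are common settings, taken here as assumptions.

```python
import hashlib
import hmac
import struct
import time


def hotp(key: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password for a given counter value."""
    mac = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte select a 4-byte window.
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(key: bytes, at=None, step: int = 30, digits: int = 6) -> str:
    """Time-based one-time password: HOTP with the time window as the counter."""
    now = time.time() if at is None else at
    return hotp(key, int(now // step), digits)
```

Both sides call totp with the shared key and compare the results. With the test key from the RFCs (the ASCII bytes 12345678901234567890), totp(key, at=59, digits=8) gives 94287082, matching the published test vector.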

TOTP fundamentals and common vendors

OAuth2 and OpenID

OAuth (Open Authorization) is an open standard that allows users to authorize third party websites to access their information stored on another service provider without having to provide a user name and password to third party websites or share all the content of their data.

OAuth is an authorization standard, not an authentication standard. The server providing the resource does not need to know the exact user identity (session); it only needs to verify the permissions (tokens) granted by the authorization server.

The figure above shows only a simplified OAuth flow. The basic idea of OAuth is to obtain an access token and a refresh token from the authorization server, then use the access token to retrieve data from the resource server. There are also several grant types (modes) for specific scenarios:

  1. Authorization Code
  2. Implicit (simplified mode)
  3. Resource Owner Password Credentials (password mode)
  4. Client Credentials
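
To give a feel for the most common of these, the authorization code grant, here is a sketch of the two requests a client builds. The endpoint URL and function names are hypothetical; the parameter names (response_type, client_id, redirect_uri, scope, state, grant_type, code) are the ones defined by the OAuth 2.0 framework.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical endpoint, used only for illustration.
AUTHORIZE_ENDPOINT = "https://auth.example.com/authorize"


def build_authorization_url(client_id: str, redirect_uri: str, scope: str) -> tuple[str, str]:
    """Step 1: the URL the user's browser is redirected to. Returns (url, state)."""
    # The client must check that the same state value comes back on the redirect.
    state = secrets.token_urlsafe(16)
    query = urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,
    })
    return f"{AUTHORIZE_ENDPOINT}?{query}", state


def build_token_request(code: str, redirect_uri: str, client_id: str) -> dict:
    """Step 2: form fields the client POSTs to the token endpoint to exchange the code."""
    return {
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": redirect_uri,
        "client_id": client_id,
    }
```

The response to the second request carries the access token (and usually a refresh token); a confidential client would additionally authenticate itself on that request.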

OAuth itself does not define how to obtain user authentication information. If you need to identify the user, you need an additional authentication layer, such as OpenID Connect.

Verify the access token

Blog posts about OAuth rarely explain how the resource server validates access tokens. The OAuth core standard does not define this part, but other OAuth standards documents mention two ways to validate access tokens.

  1. After the authorization process is complete, the resource server can use the introspection interface provided by the OAuth server to verify the access token; the OAuth server returns the status and expiration time of the token. Introspection is the term the OAuth standards use for validating tokens. Note that the access token is a credential between the user and the resource server, not between the resource server and the authorization server, so the resource server should authenticate itself to the authorization server by other means (such as Basic authentication).
  2. Use JWT validation. The authorization server issues the access token as a JWT signed with its private key. The resource server verifies the JWT with a pre-configured public key and reads the token status and other information from the token itself. Under the JWT scheme, the resource server and the authorization server no longer need to communicate, which is a huge advantage in some scenarios. JWT also has some weaknesses, which I will explain in the JWT section.
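
For the first approach, the resource server’s side can be sketched as follows. The function names and the scope check are assumptions of this example; the token form field and the active, exp, and scope response fields are those defined by the OAuth token introspection specification (RFC 7662).

```python
import base64
import time
from urllib.parse import urlencode


def build_introspection_request(token: str, rs_id: str, rs_secret: str) -> tuple[dict, str]:
    """Headers and form body the resource server POSTs to the introspection endpoint."""
    # The resource server authenticates itself with Basic authentication.
    credentials = base64.b64encode(f"{rs_id}:{rs_secret}".encode("utf-8")).decode("ascii")
    headers = {
        "Authorization": f"Basic {credentials}",
        "Content-Type": "application/x-www-form-urlencoded",
    }
    return headers, urlencode({"token": token})


def token_is_usable(introspection: dict, required_scope: str, now=None) -> bool:
    """Interpret the JSON document returned by the introspection endpoint."""
    now = time.time() if now is None else now
    if not introspection.get("active", False):
        return False
    if "exp" in introspection and introspection["exp"] <= now:
        return False
    # Scopes are returned as a single space-separated string.
    return required_scope in introspection.get("scope", "").split()
```

Every API call then costs one extra round trip to the authorization server (unless the result is cached), which is exactly the cost the JWT approach avoids.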

Refresh token and Access token

Almost everyone has a question when they first get to know OAuth: Why do you need refresh tokens when you already have access tokens?

The authorization server returns both the Access token and the Refresh token on the first authorization request, and only the refresh token is required to refresh the Access token later. The access token is designed for interaction between a client and a resource server, while the Refresh token is designed for interaction between a client and an authorization server.

In some grant types, the access token has to be exposed to the browser, where it acts as a temporary session between the browser and the resource server. There is no signature mechanism between the browser and the resource server, so the access token is the only credential. Its TTL should therefore be as short as possible, to limit the damage if the token is sniffed.

Because the access token must be short-lived, the refresh token lets users stay signed in for a long time without frequent reauthorization. Why not simply give the access token a long expiration time? The difference is that the system remains secure even if a refresh token is intercepted: when a client uses the refresh token to obtain a new access token, it must also present its pre-configured Secure Key, so the exchange between the client and the authorization server is always authenticated.
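
The division of labor between the two tokens can be illustrated with a toy in-memory authorization server. Everything here (class name, five-minute TTL, storage in dictionaries) is an assumption made for the sketch, not a description of any real OAuth server.

```python
import secrets
import time


class AuthorizationServer:
    ACCESS_TTL = 300  # seconds; deliberately short

    def __init__(self, client_secrets: dict):
        self._client_secrets = client_secrets  # client id -> secure key
        self._refresh_tokens = {}              # refresh token -> client id
        self._access_tokens = {}               # access token -> expiry timestamp

    def authorize(self, client_id: str) -> tuple[str, str]:
        """First authorization: returns (access token, refresh token)."""
        refresh_token = secrets.token_urlsafe(32)
        self._refresh_tokens[refresh_token] = client_id
        return self._new_access_token(), refresh_token

    def refresh(self, client_id: str, client_secret: str, refresh_token: str) -> str:
        """A refresh token alone is not enough: the client must authenticate too."""
        expected = self._client_secrets.get(client_id)
        if expected is None or not secrets.compare_digest(expected, client_secret):
            raise PermissionError("client authentication failed")
        if self._refresh_tokens.get(refresh_token) != client_id:
            raise PermissionError("unknown refresh token")
        return self._new_access_token()

    def is_valid(self, access_token: str, now=None) -> bool:
        now = time.time() if now is None else now
        return self._access_tokens.get(access_token, 0) > now

    def _new_access_token(self) -> str:
        token = secrets.token_urlsafe(32)
        self._access_tokens[token] = time.time() + self.ACCESS_TTL
        return token
```

An attacker who sniffs an access token gets at most a few minutes of use; an attacker who steals only the refresh token cannot use it without the client’s key.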

OAuth, OpenID, OpenID Connect

There is so much authentication jargon that I sometimes did not really understand the terms until I had to build my own authentication server or integrate with a third-party authentication platform.

OAuth is responsible for solving authorization between distributed systems, even though the client and the resource server or authentication server sometimes run on the same machine. OAuth does not solve authentication, but its design makes it easy to integrate with existing authentication systems.

OpenID solves identity authentication between distributed systems. With an OpenID token, users can be authenticated across multiple systems and their user information returned. It can be used independently and has no connection to OAuth.

OpenID Connect solves user authentication within the OAuth framework. The basic principle is to treat the user’s authentication information (the ID token) as a resource: after authorization is completed in the OAuth framework, the user’s identity is obtained with the access token.

The relationship between these three concepts is a little hard to grasp. In practice, if a system only needs independent authentication, with no authorization between multiple systems, OpenID can be used directly. If OAuth is used as the authorization standard, users can be authenticated through OpenID Connect.


JWT

Under OAuth and other distributed authentication and authorization systems, more is demanded of credential technology: information such as the user ID and expiration time should be available without a lookup in external storage. The industry therefore optimized the token further and designed a self-contained token. Once the token is issued, its expiration and validity can be determined by parsing the token itself, without checking server-side storage. This is JWT (JSON Web Token).

JWT is a self-contained token, also called a value token. The hash values tied to a session that we used before are called reference tokens.

In short, a basic JWT is a string of three segments separated by dots.

eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiYWRtaW4iOnRydWV9.TJVA95OrM7E2cBab30RMHrHDcEfxjoYZgeFONFh7HgQ

The process for generating a JWT is as follows:

  1. The Base64 (URL-safe) encoding of the header JSON is the first part of the token
  2. The Base64 (URL-safe) encoding of the payload JSON is the second part of the token
  3. Join parts 1 and 2 with a dot and sign the result with the secret; the encoded signature is the third part of the token

Therefore, only the signing secret is needed to verify a JWT. If the user ID and expiration information are placed in the payload, the token’s validity and expiry can be checked without reading anything from a database or cache. Because the token is signed, any modification to the first or second part (including the expiration information) makes verification fail. The advantage of JWT is that it serves not only as a token but also carries some necessary information, saving extra queries.
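
The generation and verification steps fit in a short Python sketch. It implements only the HS256 (HMAC-SHA256) variant using the standard library; the function names are made up for this example, and a real project would normally use a maintained JWT library instead.

```python
import base64
import hashlib
import hmac
import json
import time


def b64url(data: bytes) -> str:
    """URL-safe Base64 without padding, as used by JWT."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def encode_jwt(payload: dict, secret: str) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(
        secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    return f"{signing_input}.{b64url(signature)}"


def decode_jwt(token: str, secret: str, now=None) -> dict:
    """Verify the signature and expiry, then return the payload."""
    signing_input, _, signature = token.rpartition(".")
    expected = hmac.new(
        secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    if not hmac.compare_digest(expected, b64url_decode(signature)):
        raise ValueError("bad signature")
    payload = json.loads(b64url_decode(signing_input.split(".")[1]))
    now = time.time() if now is None else now
    if "exp" in payload and payload["exp"] <= now:
        raise ValueError("token expired")
    return payload
```

Note that decode_jwt never touches a database: the signature proves the payload was issued by a holder of the secret, and the exp claim inside the payload decides whether the token is still valid.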


JWT also has some weaknesses and points to note:

  1. The first and second parts of a JWT are only Base64 encoded: unreadable to the naked eye but trivially decoded, so they should not store sensitive information
  2. JWTs are self-contained and cannot be revoked once issued
  3. The JWT signature algorithm is configurable. For easy debugging, a symmetric algorithm can be used in a local environment; an asymmetric algorithm is recommended in production

JWT is particularly advantageous in microservice systems. A JWT can be passed straight through the APIs of a multi-layer call chain, and because it is self-contained it reduces the number of user information queries. More importantly, with an asymmetric algorithm every service can validate JWTs once the public key has been distributed across the system.

Of course, OAuth does not restrict the technology used for access tokens and other credentials, and it does not force the use of JWT. When taking advantage of JWT’s self-contained nature, the difficulty of revoking a JWT must be taken into account. JWT is not suitable for projects with strict requirements on token revocation, and some workarounds (whitelists and blacklists) even defeat the original purpose of the JWT design.

Cookies, tokens in cookies, and session tokens are still in use

When building an API, developers find that the authentication approach differs a little from that of traditional web applications. Apart from typical web techniques such as Ajax, cookies are not recommended if we want the API to be stateless.

The essence of using cookies is that the server assigns a session ID to users on their first visit, and the client carries this ID as the identifier of the current user in subsequent requests. Because HTTP itself is stateless, cookies are the browser’s built-in way of keeping state. If our API is meant to be consumed by client programs, forcing the API caller to manage cookies would also do the job.

In some projects with legacy or non-standard authentication implementations, we can still see the following practices, which are quick ways to implement authentication.

  1. Use cookies, for example for Ajax calls in web projects
  2. Use the session ID or a hash as the token, but pass the token in a header
  3. Pass the generated token (possibly a JWT) in a cookie that is protected with the HttpOnly and Secure flags
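
The third practice can be sketched with Python’s standard http.cookies module. The cookie name token and the one-hour lifetime are arbitrary choices for this example.

```python
from http.cookies import SimpleCookie


def token_cookie_header(token: str, max_age: int = 3600) -> str:
    """Return the value of a Set-Cookie header that carries the token."""
    cookie = SimpleCookie()
    cookie["token"] = token
    cookie["token"]["httponly"] = True   # not readable from JavaScript
    cookie["token"]["secure"] = True     # only sent over HTTPS
    cookie["token"]["path"] = "/"
    cookie["token"]["max-age"] = max_age
    return cookie["token"].OutputString()
```

HttpOnly keeps scripts (and therefore most cross-site scripting payloads) from reading the token, while Secure keeps it off unencrypted connections.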

Select an appropriate authentication mode

With the development of microservices, APIs are no longer designed only for web or mobile apps; we also need authentication for BFF (Backend for Frontend) and domain APIs, and for integration with third-party services.

Client-to-server authentication is different from server-to-server authentication.

We call the communication that the end user (Human) participates in human-to-machine (H2M), and the communication between servers machine-to-machine (M2M).

H2M communication requires stronger security measures. M2M communication is inherently more secure than H2M, so it puts more emphasis on performance. Choosing the right authentication technology for each situation is therefore particularly important. HTTP Basic Authentication, for example, looks somewhat dated for H2M authentication but is heavily used in M2M.

It is also worth mentioning that in H2M communication the client is not under our control. Because a key cannot be distributed to each client independently, the security of the authentication exchange depends heavily on HTTPS.

Looking at how these techniques relate to one another from a high level is very helpful when selecting a technology.

Glossary

  1. Browser fingerprinting: querying the browser’s user agent string, screen color depth, language, and similar values, then passing them through a hash function to create a fingerprint that identifies the browser without a cookie
  2. Message Authentication Code (MAC): in cryptography, a short piece of information generated by a specific algorithm to check the integrity of a message
  3. HOTP (HMAC-based One-Time Password algorithm): a one-time password algorithm based on hash message authentication codes
  4. Two-step verification: a verification method that combines two different factors to confirm the user’s identity; a special case of multi-factor verification
  5. One-Time Password (OTP): a password that is valid for a single use, such as the verification code in a registration email or SMS


The text/ThoughtWorks’ Mr. Rynning
