Rambling on multi-tenant design for enterprise SaaS

In recent years, players have emerged in every segment of the enterprise SaaS market. From a technical point of view, SaaS offerings in different domains share the same architectural core, and the most critical part of that core is support for multi-tenancy. For most enterprises, adopting a SaaS product is essentially leasing an Internet service, so multi-tenancy is a natural attribute of SaaS and one of the main things that distinguishes it from traditional Internet application architecture. The maturity of a SaaS architecture is therefore largely determined by how it implements multi-tenancy.

The core focus of multi-tenant technology

At present, there is no established specification for the technical implementation of multi-tenancy. There are many details, and each detail can be implemented in several ways. On the one hand, the choice depends on the R&D team’s existing technical expertise, technology selection, and financial strength, and on industry or customer characteristics (for example, the financial industry has higher requirements for data security). On the other hand, it is closely tied to the state of technology: the rise of cloud vendors and the arrival of the cloud-native era have profoundly changed how software, including SaaS, is built.

But in general, a true SaaS application needs two things:

Single instance
Multi-tenancy

Single instance means sharing at the system resource level, and multi-tenancy means isolation at the application logic level. How to balance the two is the core focus of multi-tenant design for SaaS applications.

The classic distributed service architecture naturally addresses the “three highs” of Internet applications (high concurrency, high performance, high availability), which are also the problems an enterprise SaaS product faces in the later stages of its growth. Let’s analyze how to design and implement a multi-tenant SaaS application under this architecture.

Implementation of multi-tenancy

From the perspective of resource sharing, multi-tenancy can be supported at any point on the scale from Share Nothing to Share Everything. But as mentioned earlier, the first priority of a SaaS architecture is the single instance: only a single instance can drive costs down as far as possible and give the product economies of scale. In the classic architecture, therefore, sharing and isolation come down to how to separate different tenants at the resource level.

Resources

When we think of resources, we might think of CPU, memory, disk, network bandwidth, and so on. However many types there are, by their characteristics they fall into two categories: storage resources and computing resources.

In other words, in technical terms a SaaS system can be seen as a convergence of distributed storage and distributed computing.

In a multi-tenant implementation, the handling of storage resources is usually the more critical part, and computing resources are generally considered only when necessary. I think this is mainly because storage is “stateful”. Let’s look at some typical scenarios to see how to approach multi-tenant design.

Isolation of storage resources

Isolation of storage resources can be summed up in one word: namespace. Taking a database as an example, we only need to record the owning tenant’s identity on each of that tenant’s records.

In general, leaving aside database and table sharding, we logically store the data of all tenants in the same schema. This requires every table to have a tenant_id field, so that each record carries its “namespace”: the tenant id.

Take Redis, a common NoSQL solution, as another example. Generally speaking, all tenant data is stored in the same distributed cluster, so the obvious approach is to carry the tenant id in the key.

So whatever the kind of storage, the idea is the same, and the handling is simple and blunt. What I want to emphasize, though, is that at the engineering level these conventions need to be unified inside the underlying framework.

For example, every SQL statement executed in a tenant context should carry where tenant_id=?. It’s hard to imagine everyone remembering this rule all the way from zero to a hundred thousand or a million lines of code.
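To make the idea of pushing this rule into the framework concrete, here is a minimal sketch in Java. The names TenantContext, ScopedQuery and TenantSql are hypothetical and exist only for illustration; the sketch assumes the tenant id is bound to the request thread and that queries are built from a table name and a predicate, so the tenant filter is added in exactly one place.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical holder for the tenant id bound to the current request thread.
final class TenantContext {
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    static void set(String tenantId) { CURRENT.set(tenantId); }

    static void clear() { CURRENT.remove(); }

    // Fails fast instead of silently running a query without a tenant.
    static String require() {
        String id = CURRENT.get();
        if (id == null) {
            throw new IllegalStateException("no tenant bound to the current thread");
        }
        return id;
    }
}

// Hypothetical value object: SQL text plus its bind parameters, in order.
final class ScopedQuery {
    final String sql;
    final List<Object> params;

    ScopedQuery(String sql, List<Object> params) {
        this.sql = sql;
        this.params = params;
    }
}

final class TenantSql {
    // Builds a SELECT whose first condition is always the tenant filter.
    // The caller's predicate is wrapped in parentheses so that an OR inside
    // it cannot escape the tenant_id condition.
    static ScopedQuery select(String table, String predicate, List<Object> params) {
        String sql = "SELECT * FROM " + table
                + " WHERE tenant_id = ? AND (" + predicate + ")";
        List<Object> all = new ArrayList<>();
        all.add(TenantContext.require());
        all.addAll(params);
        return new ScopedQuery(sql, all);
    }
}
```

Business code then calls TenantSql.select("orders", "status = ?", ...) and never writes the tenant condition itself. A real project would more likely hook this into its persistence framework than build SQL strings by hand; the point of the sketch is only that the rule lives in one component.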

In scenarios like this, we can use AOP to factor out the multi-tenancy logic and handle it in one place. In Java, for example, we can define a @TenantContextAware annotation so that tenant information is obtained and propagated declaratively, where needed, rather than by hand-written code.

But how can we make sure developers remember to apply even this rule? Since multi-tenancy is a natural feature of SaaS, we can turn the approach around: support multi-tenancy logic by default, and define a @TenantContextUnAware annotation to declare the exceptions where multi-tenancy is not required. This greatly reduces the burden on the development team.
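The default-on approach with an opt-out annotation could look roughly like the following sketch. It uses a JDK dynamic proxy (java.lang.reflect.Proxy) as a stand-in for a full AOP framework, which is a simplification of my own. ReportService and TenantGuard are hypothetical names, and the annotation is declared here only so the example is complete.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;

// Opt-out marker: methods carrying it may run without a tenant.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface TenantContextUnAware {}

// Hypothetical service interface used only for this illustration.
interface ReportService {
    String tenantReport();

    @TenantContextUnAware
    String platformHealth();
}

final class TenantGuard {
    // Tenant bound to the current thread; null means no tenant is set.
    static final ThreadLocal<String> CURRENT_TENANT = new ThreadLocal<>();

    // Wraps target so that every interface method requires a tenant,
    // unless the method is annotated with @TenantContextUnAware.
    @SuppressWarnings("unchecked")
    static <T> T guard(Class<T> api, T target) {
        InvocationHandler handler = (proxy, method, args) -> {
            boolean optedOut = method.isAnnotationPresent(TenantContextUnAware.class);
            if (!optedOut && CURRENT_TENANT.get() == null) {
                throw new IllegalStateException("tenant required for " + method.getName());
            }
            try {
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // rethrow the target's own exception
            }
        };
        return (T) Proxy.newProxyInstance(api.getClassLoader(), new Class<?>[] {api}, handler);
    }
}
```

With this wiring, calling tenantReport() without a bound tenant fails immediately, while platformHealth() is allowed through. Developers only have to think about multi-tenancy in the rare places where it does not apply.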

Similarly, it is recommended to define a unified KeyGeneratePolicy for Redis Key maintenance.
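As an illustration, a KeyGeneratePolicy might be shaped like the sketch below. The method signature, the tenant: prefix and the colon separator are assumptions made for this example, not a fixed convention.

```java
// Hypothetical shape of a unified key policy: every cache key is built here,
// so the tenant id can never be forgotten at a call site.
interface KeyGeneratePolicy {
    String keyFor(String tenantId, String domain, String id);
}

final class TenantPrefixedKeyPolicy implements KeyGeneratePolicy {
    private static final String SEPARATOR = ":";

    @Override
    public String keyFor(String tenantId, String domain, String id) {
        requireSegment("tenantId", tenantId);
        requireSegment("domain", domain);
        requireSegment("id", id);
        // Example layout (an assumption): tenant:<tenantId>:<domain>:<id>
        return "tenant" + SEPARATOR + tenantId + SEPARATOR + domain + SEPARATOR + id;
    }

    // Rejects empty segments and segments containing the separator, which
    // could otherwise make keys of two different tenants collide.
    private static void requireSegment(String name, String value) {
        if (value == null || value.isEmpty() || value.contains(SEPARATOR)) {
            throw new IllegalArgumentException("invalid key segment: " + name);
        }
    }
}
```

For example, new TenantPrefixedKeyPolicy().keyFor("t-42", "session", "abc") yields tenant:t-42:session:abc, and all Redis access goes through the policy rather than concatenating strings ad hoc.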

Isolation of computing resources

Computing resources are isolated by means of affinity. In short, an affinity is designed between tenants and the computing resources of the cluster.

Besides the difference in “state”, there is another very important difference between computing and storage: the financial cost of computing is often much higher than that of storage. For example, on a single virtual host we may only allow a few hundred threads to process requests at the same time.

Because of this, precious computing resources are usually not isolated at fine granularity unless it is necessary. For example, we do not pin the requests of one tenant to a specific worker thread at run time.

On the other hand, the consequences of skew in computing resources are often much more serious than for storage. Like the bucket effect, in which the shortest stave limits the whole barrel, it directly and significantly reduces the service capacity of the entire cluster.

But coarser-grained isolation is sometimes necessary in specific scenarios. For example, to limit how far one tenant’s impact spreads during a system failure, we might hash each tenant’s requests and submit them to different thread pools, because otherwise the resulting backpressure would have a global impact.

In addition, we may also isolate at the process or cluster level in certain scenarios. In general, isolating computing resources is not recommended unless it is absolutely necessary: there is no set pattern or routine, and it often demands a high level of skill in operating resources.

Likewise, if it must be implemented, it should be done in a componentized way so that the business logic stays pure.
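Hashing tenants onto separate thread pools can be sketched as follows. TenantShardedExecutor is a hypothetical name, and the pool count and sizes are placeholders. The sketch assumes the JDK's fixed thread pools are an acceptable stand-in for whatever executor the service framework provides.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical component: spreads tenants over several independent pools so
// that a tenant saturating its pool only slows down the tenants sharing it.
final class TenantShardedExecutor {
    private final ExecutorService[] pools;

    TenantShardedExecutor(int poolCount, int threadsPerPool) {
        if (poolCount <= 0 || threadsPerPool <= 0) {
            throw new IllegalArgumentException("pool count and size must be positive");
        }
        pools = new ExecutorService[poolCount];
        for (int i = 0; i < poolCount; i++) {
            pools[i] = Executors.newFixedThreadPool(threadsPerPool);
        }
    }

    // floorMod keeps the index non-negative even for negative hash codes.
    int poolIndexFor(String tenantId) {
        return Math.floorMod(tenantId.hashCode(), pools.length);
    }

    // The same tenant always lands on the same pool.
    <T> Future<T> submit(String tenantId, Callable<T> task) {
        return pools[poolIndexFor(tenantId)].submit(task);
    }

    void shutdown() {
        for (ExecutorService pool : pools) {
            pool.shutdown();
        }
    }
}
```

Because the routing lives in one component, the business code only passes a tenant id and a task. Note that a plain hash can still place two heavy tenants on the same pool, which is the skew risk described above.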

With the above isolation of storage and computing resources, our overall SaaS architecture would look like this.

Here is a table to make a simple comparison of the two methods with a few points, so that you can understand them more intuitively.

Extension of single instance architecture

Enterprise-oriented SaaS services often have characteristics that lead to higher-order requirements, which a standalone single-instance architecture sometimes cannot fully meet. At that point the original architecture needs to be extended: isolation at the instance level, combined with request diversion at the tenant level, brings isolation of resources, software versions, and other aspects of the SaaS service.

However, it is important to note that extending the single-instance architecture does not reduce its architectural maturity and does not conflict with the single-instance concept emphasized throughout this article.

For example, we tend to assign service guarantee levels according to the scale and characteristics of enterprise customers, so how to isolate resources sensibly and safeguard the experience of customers at each level is an unavoidable problem.

In this case, we can apply special protective isolation to some of the resources of such customers, or we can extend the single-instance architecture into a multi-instance architecture and divert customers into resource pools with different protection levels.

If an individual customer’s volume is far larger than that of other customers, then, cost permitting, we might even build a dedicated resource pool for it to provide focused protection. This level of protection does not mean sacrificing the experience of smaller customers. On the contrary, large-volume customers are often the ones most likely to trigger incidents that affect stability, so it can be seen as a win for all parties.

In addition, SaaS tends to deliver features to customers faster, but rapid delivery can easily lead to a poor experience, for example when a release contains serious bugs.

At this point, if we have a multi-instance architecture, we can easily implement grayscale releases, making feature delivery more robust and protecting our brand image.
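Diverting customers into pools by protection level amounts to a lookup at the routing layer. The sketch below is a minimal illustration: the tier names, the pool names and the ResourcePoolRouter class are all hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical protection levels; real tiers depend on the business.
enum ServiceTier { STANDARD, PREMIUM, DEDICATED }

final class ResourcePoolRouter {
    private final Map<String, ServiceTier> tierByTenant = new ConcurrentHashMap<>();

    void assign(String tenantId, ServiceTier tier) {
        tierByTenant.put(tenantId, tier);
    }

    // Returns the logical pool (instance group) that should serve the tenant.
    // Tenants without an explicit assignment fall back to the shared pool.
    String poolFor(String tenantId) {
        ServiceTier tier = tierByTenant.getOrDefault(tenantId, ServiceTier.STANDARD);
        switch (tier) {
            case DEDICATED:
                return "pool-dedicated-" + tenantId;
            case PREMIUM:
                return "pool-premium";
            default:
                return "pool-shared";
        }
    }
}
```

Each pool still runs the same single-instance, multi-tenant application, so the routing table is the only thing that knows about tiers.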
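One simple way to drive a grayscale release at the tenant level is to send a stable percentage of tenants to the instance group running the new version. The following is a sketch under that assumption. GrayReleaseRouter and the canary and stable labels are hypothetical, and real rollouts often use an explicit tenant allowlist instead of a hash.

```java
// Hypothetical router: a fixed share of tenants is served by the new version.
final class GrayReleaseRouter {
    private final int rolloutPercent;

    GrayReleaseRouter(int rolloutPercent) {
        if (rolloutPercent < 0 || rolloutPercent > 100) {
            throw new IllegalArgumentException("rolloutPercent must be between 0 and 100");
        }
        this.rolloutPercent = rolloutPercent;
    }

    // Each tenant maps to a fixed bucket in [0, 100), so it keeps seeing the
    // same version for as long as the percentage is unchanged.
    boolean useCanary(String tenantId) {
        return Math.floorMod(tenantId.hashCode(), 100) < rolloutPercent;
    }

    String instanceFor(String tenantId) {
        return useCanary(tenantId) ? "canary" : "stable";
    }
}
```

Raising the percentage step by step widens the rollout, and setting it back to 0 returns every tenant to the stable instances.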

Summary

In practice, it is easy to neglect early, systematic planning and design of foundational capabilities such as multi-tenancy, which leads to steadily rising R&D and maintenance costs and even to an inability to respond flexibly to new business opportunities.

A good architecture makes these essential characteristics transparent, so that the business layer is unaware of them, which improves R&D efficiency. In designing a multi-tenant architecture for enterprise SaaS, we can hardly enumerate or foresee every possibility, and multi-tenant implementations differ greatly under different technology choices. We should therefore focus on the essence of the technology: plan and design the system around the isolation of computing and storage resources, and build up and consolidate the foundational components.

Only by looking past the surface phenomena and distilling the essential methods can we meet constant change with what stays the same.

About the author

Zhang. As an architect at NetEase Intelligent Enterprise, I am responsible for the architecture, infrastructure construction, and related work of several of the company’s SaaS products. I have extensive experience in the research and development of both consumer-facing (C-end) and business-facing (B-end) products. At present I focus mainly on the technical architecture and R&D management of enterprise-level products.

For more in-depth technical articles, follow “NetEase Intelligent Enterprise Technology+”, where you can hear NetEase’s CTO discuss frontier observations, read the most valuable technical articles, and learn from NetEase’s latest practical experience. NetEase Intelligent Enterprise Technology+ will accompany you as you grow from a thinker into a technical expert.
