Optimize your data access by using the CQRS architecture pattern – Part 1

A theoretical and practical approach

How many times have you started a new service with a simple CRUD architecture around a domain object inside your boundaries? At that moment it is "OK", but as the surrounding ecosystem grows and extends over time, you begin to notice one (or both) of the following:

  1. The need to perform queries around your object becomes complex and difficult to handle (multiple HTTP calls to other services, expensive cross-table joins, and so on).
  2. Your service ends up serving far more reads than writes, and write performance degrades as a result.

Of course, the problems pointed out above can be solved in many ways. Off the top of my head, I can think of data patterns such as sharding, materialized views in the database, offloading reports to a data warehouse, and using different database access techniques for reads and updates. There are probably others, and in practice these problems are usually solved by combining several patterns and methods.

However, I will focus my writing on one of my favorite patterns, the CQRS pattern.

What problem does it solve?

When a traditional architecture like CRUD is used for a large-scale solution, having the same data model for both updating and querying the database can end up becoming a burden. For example:

  • A read endpoint may run multiple queries against different sources to return complex DTOs with different shapes. And as we’ve seen, the mapping can get pretty complicated.
  • On the write side, the model may implement multiple complex business rules to validate creation and update operations.
  • We might want to query the model in other ways, perhaps by folding multiple records into one, by aggregating information that is not currently available in the model's domain, or simply by changing how the query looks up records, using some secondary fields as keys.

As a result, we end up doing too much in the CRUD service around the model objects, and it only gets worse as the service grows. This is where the CQRS pattern joins our toolbelt to help us solve these scalability issues.

The pattern

CQRS stands for _Command and Query Responsibility Segregation_. It is based on the simple idea of separating the operations that mutate data (commands) from the operations that read it (queries). To achieve this, it splits reads and writes into separate models: data is created and updated through commands, and read through queries.
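To make the split concrete, here is a minimal sketch in Go. All the names in it (`CreateOrder`, `WriteModel`, `ReadModel`, `OrderSummary`) are hypothetical and exist only for illustration. The write model validates commands, the read model only answers queries, and the direct `Project` call in `main` is a shortcut that stands in for whatever mechanism keeps the two models in sync.

```go
package main

import (
	"errors"
	"fmt"
)

// CreateOrder is a command: a request to change state.
type CreateOrder struct {
	ID         string
	Customer   string
	TotalCents int
}

// OrderSummary is a read-side DTO shaped for display, not for validation.
type OrderSummary struct {
	ID    string
	Label string
}

// WriteModel owns the business rules and the state changes.
type WriteModel struct {
	orders map[string]CreateOrder
}

func NewWriteModel() *WriteModel {
	return &WriteModel{orders: make(map[string]CreateOrder)}
}

// Handle validates the command and, if accepted, stores the order.
func (w *WriteModel) Handle(cmd CreateOrder) error {
	if cmd.ID == "" || cmd.TotalCents <= 0 {
		return errors.New("invalid order")
	}
	if _, exists := w.orders[cmd.ID]; exists {
		return errors.New("order already exists")
	}
	w.orders[cmd.ID] = cmd
	return nil
}

// ReadModel only answers queries; it carries no business rules.
type ReadModel struct {
	summaries map[string]OrderSummary
}

func NewReadModel() *ReadModel {
	return &ReadModel{summaries: make(map[string]OrderSummary)}
}

// Project stores a query-friendly copy of an accepted order.
func (r *ReadModel) Project(o CreateOrder) {
	r.summaries[o.ID] = OrderSummary{
		ID:    o.ID,
		Label: fmt.Sprintf("%s: %d.%02d", o.Customer, o.TotalCents/100, o.TotalCents%100),
	}
}

// Summary is a query: it returns data and never mutates state.
func (r *ReadModel) Summary(id string) (OrderSummary, bool) {
	s, ok := r.summaries[id]
	return s, ok
}

func main() {
	write, read := NewWriteModel(), NewReadModel()
	cmd := CreateOrder{ID: "o-1", Customer: "ada", TotalCents: 1250}
	if err := write.Handle(cmd); err != nil {
		fmt.Println("rejected:", err)
		return
	}
	read.Project(cmd) // direct hand-off, only to keep the sketch short
	if s, ok := read.Summary("o-1"); ok {
		fmt.Println(s.Label)
	}
}
```

Notice that the validation lives only in `Handle`, while `Summary` stays trivial. That asymmetry is the point of the separation.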

Conceptual architecture of general CQRS

As shown in the figure above, an event queue connects the write and read worlds: every time an instance of our domain is created or updated on the write side, an event is pushed to the topic. The query service then reads each incoming event and normalizes, enriches, slices, and chunks the data to build a query-optimized model, which it stores for later reads.
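The flow described above can be sketched in a few lines of Go. This is only an illustration under some loud assumptions: a buffered channel stands in for the topic, an in-memory map stands in for the read database, and the `UserUpdated` event and `UserView` shape are made-up names.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// UserUpdated is the (hypothetical) event the write side publishes
// after each create or update.
type UserUpdated struct {
	ID        string
	FirstName string
	LastName  string
	Country   string
}

// UserView is the query-optimized shape kept by the read side.
type UserView struct {
	ID          string
	DisplayName string
	Country     string
}

// QueryStore is an in-memory stand-in for the read database.
type QueryStore struct {
	mu    sync.RWMutex
	views map[string]UserView
}

func NewQueryStore() *QueryStore {
	return &QueryStore{views: make(map[string]UserView)}
}

// Project consumes events until the channel is closed, reshaping
// each one into a view and upserting it by ID.
func (s *QueryStore) Project(events <-chan UserUpdated) {
	for ev := range events {
		view := UserView{
			ID:          ev.ID,
			DisplayName: strings.TrimSpace(ev.FirstName + " " + ev.LastName),
			Country:     strings.ToUpper(ev.Country),
		}
		s.mu.Lock()
		s.views[ev.ID] = view
		s.mu.Unlock()
	}
}

// Get serves a query straight from the prepared view.
func (s *QueryStore) Get(id string) (UserView, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.views[id]
	return v, ok
}

func main() {
	events := make(chan UserUpdated, 8) // stand-in for the topic
	store := NewQueryStore()

	var wg sync.WaitGroup
	wg.Add(1)
	go func() { // the query service runs independently of the writer
		defer wg.Done()
		store.Project(events)
	}()

	// The write side publishes one event per create/update.
	events <- UserUpdated{ID: "u-1", FirstName: "Grace", LastName: "Hopper", Country: "us"}
	events <- UserUpdated{ID: "u-2", FirstName: "Alan", LastName: "Turing", Country: "uk"}
	close(events)
	wg.Wait()

	if v, ok := store.Get("u-1"); ok {
		fmt.Println(v.DisplayName, v.Country)
	}
}
```

In a real deployment the channel would be replaced by a durable topic and the map by a proper database, but the responsibilities stay the same: the writer only publishes, and the query service only consumes and reshapes.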

In particular, I have focused this series of articles on combining the CQRS pattern with an event sourcing architecture. This works well when we want to keep the process asynchronous, with a clear separation of concerns, and with the appropriate database engine on each side for performance (for example, a SQL database for writes and a NoSQL database for queries, optimizing the queries through materialized views to avoid expensive joins).

In addition, when we use event sourcing, the event topic becomes our golden source of data, because the entire collection of events can be replayed at any time to reproduce the current state of the data. This makes it possible to read the queue asynchronously from the start and generate a new set of materialized views from the raw data as the system evolves or when the read model must change. Materialized views are, in effect, durable read-only caches of the data.
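Here is a small Go sketch of that replay idea. It assumes the full event log is available as a slice (a stand-in for reading the topic from its first offset), and the `Deposited` event and `Replay` helper are hypothetical names. The same log is folded twice to produce two different views, the second keyed by a secondary field.

```go
package main

import "fmt"

// Deposited is a hypothetical event stored in the log.
type Deposited struct {
	AccountID string
	Owner     string
	Cents     int
}

// View is a simple materialized view: a key mapped to a running total.
type View map[string]int

// Replay folds the whole log, from the first event onward, into a fresh view.
func Replay(log []Deposited, apply func(View, Deposited)) View {
	view := View{}
	for _, ev := range log {
		apply(view, ev)
	}
	return view
}

func main() {
	log := []Deposited{
		{AccountID: "a-1", Owner: "ada", Cents: 500},
		{AccountID: "a-2", Owner: "ada", Cents: 200},
		{AccountID: "a-1", Owner: "ada", Cents: 300},
		{AccountID: "a-3", Owner: "bob", Cents: 100},
	}

	// First view: balance per account.
	byAccount := Replay(log, func(v View, ev Deposited) {
		v[ev.AccountID] += ev.Cents
	})

	// A new requirement arrives later: totals per owner. No migration
	// is needed, we replay the same events with a different fold.
	byOwner := Replay(log, func(v View, ev Deposited) {
		v[ev.Owner] += ev.Cents
	})

	fmt.Println(byAccount["a-1"], byOwner["ada"])
}
```

Because the events are never modified, adding a view is just a matter of writing a new fold and replaying the log.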

Another benefit of separating the two worlds is the opportunity to scale each one independently, which can also reduce lock contention. In addition, since most of the complex business logic lives in the write model, separating the models makes them more flexible and easier to maintain. The read model is relatively simple.

When is this pattern a convenient solution?

Like any pattern, CQRS can be useful sometimes, but not always. We have learned on many occasions that there is no silver bullet that will kill all our problems. However, it can be useful under requirements such as these:

  • The performance of data reads must be fine-tuned separately from the performance of data writes, especially when the number of reads is much greater than the number of writes. In this case, you can scale out the read model while running the write model on only a few instances.
  • Development needs to be split across teams: one team can focus on the complex domain model that is part of the write model, while another team focuses on the read model and the user interface.
  • The system is expected to evolve over time and may contain multiple versions of the model, or its business rules change frequently.
  • The system integrates with other systems, especially in combination with event sourcing, and the temporary failure of one subsystem should not affect the availability of the others.
  • Reading eventually consistent data is acceptable, which is a consequence of the asynchronous nature of this pattern.
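The last point deserves a closer look, because it is the one that surprises people most. The Go sketch below is a deliberately simplified, single-threaded illustration: the `System` type, the `PriceChanged` event, and the explicit `Drain` step are all hypothetical, and `Drain` stands in for an asynchronous consumer catching up. It shows that a read issued right after a write may not see that write yet.

```go
package main

import "fmt"

// PriceChanged is a hypothetical event emitted by the write side.
type PriceChanged struct {
	SKU   string
	Cents int
}

// System bundles a write store, a pending event queue, and a read view.
type System struct {
	writeStore map[string]int
	pending    []PriceChanged
	readView   map[string]int
}

func NewSystem() *System {
	return &System{
		writeStore: make(map[string]int),
		readView:   make(map[string]int),
	}
}

// SetPrice is the command path: it commits the write and enqueues an
// event. It does not touch the read view.
func (s *System) SetPrice(sku string, cents int) {
	s.writeStore[sku] = cents
	s.pending = append(s.pending, PriceChanged{SKU: sku, Cents: cents})
}

// Price is the query path: it only ever looks at the read view.
func (s *System) Price(sku string) (int, bool) {
	cents, ok := s.readView[sku]
	return cents, ok
}

// Drain simulates the consumer catching up with the queue.
func (s *System) Drain() {
	for _, ev := range s.pending {
		s.readView[ev.SKU] = ev.Cents
	}
	s.pending = nil
}

func main() {
	sys := NewSystem()
	sys.SetPrice("sku-1", 999)

	_, ok := sys.Price("sku-1")
	fmt.Println("visible before the consumer catches up:", ok)

	sys.Drain()
	cents, ok := sys.Price("sku-1")
	fmt.Println("visible after the consumer catches up:", ok, cents)
}
```

If your use case needs a reader to see its own write immediately, that requirement has to be handled explicitly, or CQRS with asynchronous projection may not be the right fit.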

Summary

As I said above, CQRS is not the solution to all of our scaling and query problems, and we should be very careful when adopting it. Many systems fit and work well with CRUD alone because they are simple enough; switching them to CQRS would waste time turning something easy to manage into an architecture that is complex to implement and maintain. So keep it simple unless you have to. Keep in mind, too, that reading data from the queue and doing all the transformations needed to de-normalize the original input can be expensive in both time and resource usage.

However, despite the complexity, I find this pattern useful when faced with issues such as the need to query our model in multiple ways without affecting write performance. Not only that, it's worth emphasizing that we can replay the entire queue to create a new view of the original data, which makes queries simpler and cheaper because we avoid table joins and data merges.

Next steps

Of course, there are many other aspects of this pattern that could be discussed in a longer article, but I want to at least highlight the core principles that govern this architecture. So, let’s jump right to the second part of this series and look at this pattern in action and how the moving parts interact by coding some silly examples using GoLang, Kafka, MongoDB, and Cassandra. See you next time!

[Update] Part 2 is ready for you!

Optimize your data access by using CQRS architectural patterns — a combination of theory and practice.


"Optimizing your Data access by Using the CQRS Architectural Pattern: a theoretical and practical approach" was originally published in ITNEXT on Medium.