In recent years, everyone seems to have become obsessed with microservices, while monolithic architectures have gradually faded from view.

Of course, hot trends come and go, and the attention they receive is often exaggerated by the media beyond what they deserve. For microservices, though, there seems to be a consensus that the trend is here to stay, and it makes sense: conceptually, microservices extend the same design principles that engineers have relied on for decades.

Once you start running a microservices architecture, the five challenges, and the corresponding solutions, discussed in this article can help you run it successfully.

The flip side of microservices

Separation of concerns (SoC), a design principle that dictates that software should be split into parts according to distinct “concerns,” or areas of functionality, has shaped how systems are built for more than 30 years. In a monolithic application, it is embodied in the separation of the presentation, business, and data layers of a typical 3-tier architecture.

Microservices take this concept and turn it on its head. They decompose the same application so that a single codebase can be split into pieces that are built and deployed independently. The benefits are huge, but they come at a cost, usually in the time and money spent on operations and maintenance. Beyond the significant upfront investment of moving an existing application into containers, maintaining it afterwards presents new challenges.

Challenge 1: Monitoring the whole system becomes difficult

While monolithic applications have their own challenges, rolling back a “bad” version of a monolith is fairly straightforward. In containerized applications, things get a lot more complicated. Whether you’re breaking an existing monolith into microservices or building a new system from scratch, you now have many more services to monitor. Each of them may:

  • Use different technologies and/or languages.

  • Run on different machines and/or containers.

  • Be containerized and orchestrated with Kubernetes or a similar technology.

As a result, the system becomes highly distributed and needs more centralized monitoring. Unfortunately, that also means there is much more to monitor. Where there was once a single process, there may now be dozens of containerized processes running in different regions, and sometimes even in different clouds. Instead of a handful of operational metrics that IT/operations teams could use to assess general application uptime, teams now have to deal with hundreds (or even thousands) of metric, event, and alert types, from which they need to separate the meaningful signals from the noise.

The solution

DevOps monitoring has to move from a flat data model to a layered one in which a set of high-level system and business KPIs can be observed at any time. Whenever one of those KPIs deviates even slightly, the team must be able to drill down through the metric hierarchy to see which microservice the deviation is coming from, and from there which container actually failed. This will most likely require a realignment of the DevOps tool chain from a data storage and visualization perspective. Open source tools such as the Prometheus time-series database and Grafana 7.0 make this goal much easier to achieve.
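As a rough illustration of what such a hierarchy can look like, the sketch below uses the prometheus_client library for Python to expose a business-level KPI labeled by service and region, so a Grafana dashboard can show the top-level KPI first and then drill down by label. The metric, service, and region names are purely illustrative assumptions, not anything prescribed by the tools.

```python
# A minimal sketch of layered metrics, assuming the prometheus_client
# Python package; the service/region label values are hypothetical.
import random
import time

from prometheus_client import Counter, Histogram, start_http_server

# High-level business KPI: every request is counted per service and region,
# so a dashboard can show the overall error rate first, then drill down.
REQUESTS = Counter(
    "checkout_requests_total",
    "Checkout requests processed",
    ["service", "region", "status"],
)
LATENCY = Histogram(
    "checkout_request_seconds",
    "Checkout request latency in seconds",
    ["service", "region"],
)

def handle_request(service: str, region: str) -> None:
    start = time.time()
    ok = random.random() > 0.05          # stand-in for real work
    REQUESTS.labels(service, region, "ok" if ok else "error").inc()
    LATENCY.labels(service, region).observe(time.time() - start)

if __name__ == "__main__":
    start_http_server(8000)              # Prometheus scrapes /metrics here
    while True:
        handle_request("payment", "eu-west-1")
        time.sleep(0.1)
```

With labels like these in place, a query such as sum(rate(checkout_requests_total{status="error"}[5m])) by (service) surfaces which microservice a KPI deviation is coming from before anyone has to dig into individual containers.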

Challenge 2: Cross-service logging

One of the first things that comes up when talking about monitoring applications is: logs, logs, logs. The logs servers churn out every day are the IT equivalent of carbon emissions, overflowing hard drives and driving up ingestion, storage, and tooling costs. Even with a monolithic architecture, your logs may already be giving your engineers headaches.

With microservices, logging becomes even more decentralized. A single user transaction can now flow through many services, all of which have their own logging framework. To troubleshoot a problem, you must pull together all the different logs from every service the transaction might have passed through in order to understand it.

The solution

The main challenge here is understanding how an individual transaction “flows” between the different services. Achieving this requires major changes to how a traditional monolith logs events during the sequential execution of a transaction. While a number of frameworks have emerged to help developers with this process (we particularly like Jaeger’s approach), moving to asynchronous, trace-driven logging remains a daunting effort for enterprises that want to refactor a monolith into microservices.
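To make the idea of a transaction “flowing” concrete, here is a minimal sketch, not taken from the article, of propagating a correlation ID so that every log line from every service a transaction touches can later be joined back together. Tracing frameworks such as Jaeger formalize the same idea with spans and context propagation; the service names and header convention mentioned below are assumptions.

```python
# Minimal sketch of correlation-ID logging across services, using only the
# standard library; service names and field names are illustrative.
import logging
import uuid

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s service=%(service)s trace_id=%(trace_id)s %(message)s",
)

def get_logger(service: str, trace_id: str) -> logging.LoggerAdapter:
    # Every record carries the service name and the transaction's trace id,
    # so logs from different services can be correlated on trace_id later.
    return logging.LoggerAdapter(
        logging.getLogger(service), {"service": service, "trace_id": trace_id}
    )

def checkout_service(trace_id: str) -> None:
    log = get_logger("checkout", trace_id)
    log.info("received order")
    payment_service(trace_id)            # the trace id travels with the call
    log.info("order completed")

def payment_service(trace_id: str) -> None:
    # In a real system the id would arrive via an HTTP header,
    # for example X-Request-ID or a W3C traceparent header.
    log = get_logger("payment", trace_id)
    log.info("charging card")

if __name__ == "__main__":
    checkout_service(uuid.uuid4().hex)   # one id for the whole transaction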

Challenge 3: Deploying one service breaks another

A key assumption in the monolithic world is that all code is deployed at the same time, which means an application is at its most vulnerable during a known and relatively short window (the first 24-48 hours after deployment). In the world of microservices, this assumption no longer holds: because microservices are inherently intertwined, a subtle change to one service can cause behavior or performance problems that surface in another. The challenge is that the failing microservice may belong to a development team that never expected someone else’s deployment to break their code. This can lead to unexpected instability across the application and to friction inside the organization. While a microservices architecture may make deploying code easier, it actually makes verifying code behavior after deployment harder.

The solution

The enterprise must create a shared release calendar and allocate resources to closely test and monitor the behavior of the entire application whenever related microservices are deployed. Cross-team coordination, like avocado on toast, is a recipe for success here; deploying new versions of microservices without it is asking for trouble.
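One lightweight way to back up that release calendar is an automated cross-service smoke check that runs right after any related microservice ships. The sketch below assumes each service exposes a /health endpoint at a hypothetical internal URL; neither the endpoints nor the service names come from the article.

```python
# A minimal post-deploy smoke check, assuming each microservice exposes a
# /health endpoint; the URLs and service names below are hypothetical.
import sys
import urllib.request

SERVICES = {
    "checkout": "http://checkout.internal:8080/health",
    "payment":  "http://payment.internal:8080/health",
    "shipping": "http://shipping.internal:8080/health",
}

def check(name: str, url: str) -> bool:
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            ok = resp.status == 200
    except OSError as exc:               # connection errors and HTTP errors
        print(f"{name}: FAILED ({exc})")
        return False
    print(f"{name}: {'OK' if ok else 'FAILED'}")
    return ok

if __name__ == "__main__":
    # Run after deploying any one service, so a change in one team's code
    # that breaks a dependent service is caught immediately, not days later.
    sys.exit(0 if all(check(n, u) for n, u in SERVICES.items()) else 1)
```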

Challenge 4: Difficulty in finding the root cause of the problem

At this point, you have narrowed down the service in question and extracted all the data that needs to be extracted, including the stack trace and some variable values from the logs. You may also have an APM solution such as New Relic, AppDynamics, or Dynatrace, which gives you some additional data about abnormally high processing times in some related methods. But what about the root cause of the problem?

The first few variable values you (hopefully) pull from the logs probably won’t be the ones that move the needle. They are usually more like breadcrumbs pointing to the next clue than to the underlying cause. At this point, we need to dig as deep under the application as we can. Traditionally, that requires emitting detailed information about the state of each failed transaction (that is, why it failed in the first place). The challenge is that it takes tremendous foresight for developers to know ahead of time which information they will need to troubleshoot a problem.

The solution

When the sources of errors in microservices span multiple services, a centralized approach to root cause detection is critical. The team must consider which pieces of information will be needed to diagnose future problems, and at what level they should be logged given performance and security considerations. That is a tall mountain to climb, and an endless one at that.
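As a sketch of what capturing transaction state up front can look like, the snippet below attaches the relevant variables to the log record at the point of failure, instead of hoping they can be reconstructed later from breadcrumbs. The field names and the example operation are hypothetical, and deciding what is safe and cheap enough to log still has to happen per service.

```python
# Minimal sketch of capturing failed-transaction state at the point of error;
# the field names and the example operation are illustrative assumptions.
import json
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("payment")

def charge(order_id: str, amount_cents: int, currency: str) -> None:
    try:
        if amount_cents <= 0:
            raise ValueError("amount must be positive")
        # ... call the payment provider here ...
    except Exception:
        # Log the stack trace *and* the variable state that explains why this
        # transaction failed, as a single structured, machine-parseable record.
        logger.exception(
            "charge failed %s",
            json.dumps(
                {
                    "order_id": order_id,
                    "amount_cents": amount_cents,
                    "currency": currency,
                }
            ),
        )
        raise

if __name__ == "__main__":
    try:
        charge("order-42", -100, "EUR")
    except ValueError:
        pass
```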

Challenge 5: Version management

What we think is worth highlighting is the transition from the layered model of a typical monolithic architecture to the graph model of microservices. Since more than 80% of application code is typically third-party code, managing how that third-party code is shared between a company’s different microservices becomes a key factor in avoiding an unprecedented level of dependency hell.

Consider a situation where some teams use version X.Y of a third-party component or shared utility library (almost every company has some), while others use X.Z. This increases the risk of critical software problems caused by incompatibilities between releases, or by slight behavioral changes between them, which can produce some of the most obscure and painful bugs to troubleshoot.

On top of that, any microservice that uses an older, more vulnerable version of third-party code creates security issues and a hacker’s paradise. Allowing different teams to manage their dependencies in siloed repositories may be workable in a monolithic world, but it definitely is not in a microservices architecture.

The solution

Companies must invest in redesigning their build processes to use centralized artifact repositories (Artifactory is one example) for third-party and shared utility code. Teams should keep only their own code in their separate repositories.
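A small guardrail that complements a central artifact repository is a check that flags when two services pin different versions of the same shared library. The sketch below compares Python-style requirements files across service directories; the directory layout, file names, and package pins are made-up examples, not something the article prescribes.

```python
# Minimal sketch of detecting shared-dependency version drift across services;
# the directory layout and requirements file names are assumptions.
from collections import defaultdict
from pathlib import Path

def parse_requirements(path: Path) -> dict:
    # Expects simple "package==version" pins, one per line.
    pins = {}
    for line in path.read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "==" in line:
            name, version = line.split("==", 1)
            pins[name.lower()] = version
    return pins

def find_drift(service_dirs: list) -> dict:
    # Maps package -> {version -> [services pinning that version]}.
    versions = defaultdict(lambda: defaultdict(list))
    for service in service_dirs:
        req = Path(service) / "requirements.txt"
        if req.exists():
            for pkg, ver in parse_requirements(req).items():
                versions[pkg][ver].append(service)
    # Only packages pinned at more than one version count as drift.
    return {pkg: dict(vers) for pkg, vers in versions.items() if len(vers) > 1}

if __name__ == "__main__":
    drift = find_drift(["services/checkout", "services/payment", "services/shipping"])
    for pkg, vers in drift.items():
        print(f"{pkg}: {vers}")   # e.g. a library pinned at X.Y by one team, X.Z by another
```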

Final thoughts

Like most advances in the tech industry, microservices take a familiar concept and turn it upside down. They rethink the way large-scale applications are designed, built, and maintained. They bring many benefits, but also new challenges. Looking at these five major challenges together, we can see that they all stem from the same root: whenever a new technology like microservices is adopted, it requires both a rethink and a realignment of how code is built, deployed, and observed. The advantages of microservices are hard to refuse, but the risks are huge.