
Whether microservices are suitable for small teams is a matter of opinion.

As business complexity grows, a monolithic application keeps getting bigger, much like a class whose line count keeps growing. Dividing and conquering by splitting it into multiple classes is usually the better answer, and in the same spirit, splitting a single application into several smaller ones fits the idea of divide and conquer.

Of course, a microservice architecture is not something a small team should adopt from day one; it should evolve gradually, and it is important to guard against over-design.

Our company provides SaaS services, along with custom development and private deployments for large customers. In less than two years, the technical architecture has evolved from a monolith to microservices and then to containers.

The monolith era

In the early days there were only two developers, so thinking about microservices and the like would have been overkill. Influenced by our previous company, though, we decided from the start to separate the front end from the back end, and since SEO was not a concern, we simply built an SPA (single-page application).

A separated front end and back end does not rule out server-side rendering. For e-commerce systems or systems with anonymous access, adding a thin view layer, whether in PHP or with Thymeleaf, is still a good choice.

For deployment, Nginx serves the front-end HTML resources and, based on the request path, reverse-proxies API requests to the back end on port 8080.

The interface definition

Interfaces follow standard RESTful conventions:

  • For example, /api/v2

  • Resource-centric, using plural nouns, for example /api/contacts; nesting is also possible, such as /api/groups/1/contacts/100

  • Avoid verbs in URLs as much as possible. In practice this turned out to be genuinely hard: different developers were inconsistent and some of the names were odd, and all of it had to be caught in code review.

  • PUT and PATCH are both updates, but PUT is a full update and PATCH is a partial one. With PUT, a field passed in as null should also be written to the database. At the moment we use PUT but ignore null and missing fields, which is effectively a partial update; this causes problems, for example business cases that genuinely need to clear a field require special handling (see the controller sketch after this list).

  • API documentation is generated via Swagger for front-end colleagues to use.
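To make these conventions concrete, here is a minimal sketch of what such a controller looks like in Spring MVC; the ContactController, ContactDto and ContactService names are hypothetical, not our actual code:

```java
import org.springframework.web.bind.annotation.*;

// Hypothetical sketch of the URL and update conventions described above.
@RestController
@RequestMapping("/api/v2/groups/{groupId}/contacts")
public class ContactController {

    private final ContactService contactService; // assumed service layer

    public ContactController(ContactService contactService) {
        this.contactService = contactService;
    }

    // Resource-centric, plural noun, nested under its parent resource:
    // GET /api/v2/groups/1/contacts/100
    @GetMapping("/{contactId}")
    public ContactDto get(@PathVariable long groupId, @PathVariable long contactId) {
        return contactService.find(groupId, contactId);
    }

    // Declared as PUT, but null and missing fields are ignored,
    // so in practice it behaves like a partial (PATCH-style) update.
    @PutMapping("/{contactId}")
    public ContactDto update(@PathVariable long groupId,
                             @PathVariable long contactId,
                             @RequestBody ContactDto body) {
        return contactService.updateIgnoringNulls(groupId, contactId, body);
    }
}
```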

Continuous Integration (CI)

The initial team members had all worked in large teams before, so we shared some baseline expectations around quality control and process management. An integration testing framework was therefore introduced at the very start of development: test cases could be written directly against the interfaces, executed uniformly, and coverage calculated.

Strictly speaking, automatically executed test code is usually called a unit test. We call ours integration tests because the test cases target the APIs and involve real database reads and writes, MQ operations, and so on; apart from the dependencies on external services, they closely match real production scenarios. It is essentially doing what JMeter does, but directly at the Java level.

This approach gave us a lot of convenience in the early stages of development. It is worth noting, however, that once databases and other resources are involved, data preparation and data cleanup raise many more questions, such as how to keep test data from parallel tasks from interfering with each other.
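For illustration, a test case at this level might look roughly like the sketch below; the MockMvc usage and the /api/v2/contacts endpoint are assumptions for the example, not our exact harness:

```java
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.autoconfigure.web.servlet.AutoConfigureMockMvc;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.context.junit4.SpringRunner;
import org.springframework.test.web.servlet.MockMvc;

import static org.springframework.test.web.servlet.request.MockMvcRequestBuilders.post;
import static org.springframework.test.web.servlet.result.MockMvcResultMatchers.jsonPath;
import static org.springframework.test.web.servlet.result.MockMvcResultMatchers.status;

// Sketch of an API-level test: the full Spring context, real database and MQ
// are used; only external third-party services would be stubbed out.
@RunWith(SpringRunner.class)
@SpringBootTest
@AutoConfigureMockMvc
public class ContactApiIT {

    @Autowired
    private MockMvc mockMvc;

    @Test
    public void createContactAndReadItBack() throws Exception {
        mockMvc.perform(post("/api/v2/contacts")
                        .contentType("application/json")
                        .content("{\"name\":\"alice\"}"))
               .andExpect(status().isCreated())
               .andExpect(jsonPath("$.name").value("alice"));
    }
}
```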

To automate the process, Jenkins was a no-brainer.

Developers submit code to Gerrit, which triggers Jenkins to compile the code and run the integration tests. When the tests finish, a test report is generated, and reviewers perform code review once the tests pass. A CI setup like this is good enough for the monolith era, and with the coverage provided by the integration tests, we could refactor code with far more confidence while keeping the APIs compatible.

The microservices era

Service splitting principles

At the data level, the simplest check is to look for tables in the database that have few associations with the rest; the user management module, for example, is usually the easiest to split off. In domain-driven design (DDD) terms, a service is really one or more associated domain models, and service boundaries are drawn so that data redundancy is minimal.

Within a single service, collaboration between multiple domain objects is handled by domain services. Of course, DDD is complex, and it requires domain objects to be designed as rich domain models rather than anemic ones.

In practice, the rich domain model is hard for most developers to get right, and deciding what should be behavior on the domain object and what belongs in a domain service is a real test of the team.
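As a rough, hypothetical illustration of the difference (an Order example, not from our codebase):

```java
// Anemic model: a bag of getters and setters, all business rules live in services.
class AnemicOrder {
    private String status;
    public String getStatus() { return status; }
    public void setStatus(String status) { this.status = status; }
}

// Rich model: the invariant travels with the data, so callers cannot bypass it.
class Order {
    private String status = "NEW";

    public void cancel() {
        if ("SHIPPED".equals(status)) {
            throw new IllegalStateException("a shipped order cannot be cancelled");
        }
        this.status = "CANCELLED";
    }

    public String getStatus() { return status; }
}
```

Deciding whether something like cancel() belongs on the entity itself or in a domain service that also touches other aggregates is exactly the kind of judgment call mentioned above.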

Splitting services is a big undertaking. It usually takes the few people most familiar with the business and the data discussing it together, sometimes even taking the team structure into account. The end result should be clear service boundaries with no circular dependencies and no bidirectional dependencies.

Choosing a framework

Since the original monolith used Spring Boot, Spring Cloud was the natural framework choice. Personally, I believe a microservice framework should not constrain technology or language choices, but in production practice both Dubbo and Spring Cloud proved invasive, and we ran into many problems integrating NodeJS applications into the Spring Cloud ecosystem. Perhaps a service mesh is the more reasonable path in the future.

We use Spring Cloud in a fairly typical way:

  • Zuul acts as a gateway and distributes requests from different clients to specific services

  • Eureka acts as the registry, handling service registration and service discovery

  • Every service, including the gateway, gets the rate limiting and circuit breaking capabilities provided by Hystrix

  • Services call each other through Feign and Ribbon; Feign effectively hides Eureka from the calling service

As the list above shows, service registration, service discovery, service invocation, circuit breaking, and rate limiting are all handled inside the services themselves.
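A typical Feign client is just an annotated interface; the sketch below is illustrative only (the contact-service name and ContactDto type are assumptions, and the exact FeignClient package depends on the Spring Cloud version):

```java
import org.springframework.cloud.openfeign.FeignClient;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;

// Feign + Ribbon resolve "contact-service" through the registry, so the
// calling code never deals with hosts and ports directly.
@FeignClient(name = "contact-service")
public interface ContactClient {

    @GetMapping("/api/v2/contacts/{id}")
    ContactDto findById(@PathVariable("id") long id);
}
```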

A few more words about Zuul. The Zuul that ships with Spring Cloud is a cut-down version of the Netflix one, without dynamic routing (the Groovy implementation). Another point is that Zuul’s performance is mediocre because of its synchronous programming model: for IO-intensive calls with long back-end processing times, the servlet thread pool fills up easily. So if Zuul is placed on the same physical machine as the main services, it will consume a lot of resources under heavy traffic.

Our tests also showed that going through Zuul costs roughly 30% compared with calling the service directly, especially under high concurrent load. Spring Cloud Gateway, which Pivotal is now pushing hard, supports an asynchronous programming model; future architecture work may adopt it, or simply use an Nginx-based gateway such as Kong for better performance. Of course, the synchronous model has the advantage of simpler code, as you will see later where link tracing is implemented with a ThreadLocal.

Architecture transformation

After half a year of refactoring and new requirements, the monolith was progressively broken down into more than 10 microservices, and Spark was brought in for BI. Two systems took initial shape: an online service system (OLTP) based on the microservice architecture, and a Spark-based big data analysis system (OLAP). Data sources grew from MySQL alone to include Elasticsearch and Hive. Keeping multiple data sources in sync is another topic worth discussing, but too much to cover in this article.

Automated deployment

Continuous Delivery (CD) is more complex to implement than CI. We have not achieved CD yet; with limited resources we have only automated the deployment itself.

Since the production environment has to be reached through a jump server, Jenkins builds the JAR packages and transfers them to the jump server, and Ansible then deploys them to the cluster from there.

This kind of hand-rolled deployment is fine for a small team, as long as testing (manual plus automated) is done properly before each release.

Link tracing

There are plenty of open-source full-link tracing solutions, such as Spring Cloud Sleuth + Zipkin, or Meituan’s CAT from China. The idea is that, as a request passes through multiple services, a fixed identifier lets you collect the behavior logs of the entire request chain; on top of that you can do timing analysis and derive performance diagnostics. For us, though, the primary purpose is troubleshooting: when something goes wrong, we need to quickly locate which service the exception occurred in and what the full path of the request was.

To keep the solution lightweight, we mark the chain by printing a RequestId and a TraceId in the logs. The RequestId is generated in the gateway and uniquely identifies the request. The TraceId acts as a secondary path: it starts out identical to the RequestId, but once processing enters a thread pool or a message queue, the TraceId gets an extra tag appended to identify that unique path.

For example, when a single request sends a message to MQ that may be consumed by multiple consumers, each consuming thread generates its own TraceId to mark its consuming path. The TraceId exists so that filtering by RequestId alone does not pull in too many logs. The implementation works as follows.

Simply put, an APIRequestContext is carried in a ThreadLocal to string together all the calls within a single service. For cross-service calls, the APIRequestContext is converted into HTTP headers; the called service reads those headers, rebuilds the APIRequestContext, puts it into its own ThreadLocal, and the cycle repeats, so the RequestId and TraceId are never lost. When the request enters MQ, the APIRequestContext is converted into message headers (implemented on top of RabbitMQ).
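A minimal sketch of this mechanism, assuming a ThreadLocal holder plus a Feign RequestInterceptor on the calling side; the header names, method names and the omitted RabbitMQ part are simplifying assumptions, not our exact implementation:

```java
import feign.RequestInterceptor;
import feign.RequestTemplate;

// Sketch only: the real APIRequestContext carries more than two fields.
public class APIRequestContext {

    private static final ThreadLocal<APIRequestContext> HOLDER = new ThreadLocal<>();

    private final String requestId;
    private final String traceId;

    public APIRequestContext(String requestId, String traceId) {
        this.requestId = requestId;
        this.traceId = traceId;
    }

    public static void set(APIRequestContext ctx) { HOLDER.set(ctx); }
    public static APIRequestContext current()     { return HOLDER.get(); }
    public static void clear()                    { HOLDER.remove(); }

    public String getRequestId() { return requestId; }
    public String getTraceId()   { return traceId; }

    // Caller side: copy the context into HTTP headers on every outgoing Feign call.
    // The called side does the reverse in a servlet filter: read the headers,
    // rebuild the context and put it into its own ThreadLocal.
    public static class PropagatingInterceptor implements RequestInterceptor {
        @Override
        public void apply(RequestTemplate template) {
            APIRequestContext ctx = current();
            if (ctx != null) {
                template.header("X-Request-Id", ctx.getRequestId());
                template.header("X-Trace-Id", ctx.getTraceId());
            }
        }
    }
}
```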

Once the logs are collected into the log system, locating a fault is simply a matter of grabbing the offending RequestId or TraceId and searching for it.

Operational monitoring

Before containerization we used a Telegraf + InfluxDB + Grafana stack. Telegraf acts as the probe, collecting resource metrics for the JVM, the system, MySQL and so on, and writing them to InfluxDB; Grafana then visualizes the data. Spring Boot Actuator can work with Jolokia to expose the JVM endpoints. The whole setup involves zero code and only takes time to configure.

The container era

Architecture transformation

Since containerization was planned from the start of the microservice work, the architecture did not change much; each service simply gained a Dockerfile for building its Docker image.

The parts that did change include:

  1. CI gained a step that builds the Docker image

  2. The database upgrade was stripped out of the application and packaged as a separate Docker image, which is also what automated testing uses

  3. In production, Kubernetes Services replaced Eureka

The reasons follow.

The integration of Spring Cloud and K8S

We use Red Hat’s OpenShift, which can be thought of as an enterprise edition of Kubernetes, and it has its own Service concept. A Service can sit in front of multiple Pods, each Pod being a unit that can serve traffic, and Kubernetes provides default load balancing when services call each other: the caller only needs to reference the callee’s service name. This is very similar to what Spring Cloud Feign plus Ribbon provides.

Given that Kubernetes can handle service governance, why make the switch at all? As mentioned above, to support heterogeneous languages in the Spring Cloud stack, many of our BFFs (Backend for Frontend) are implemented in NodeJS, and integrating those services into Spring Cloud meant implementing service registration, load balancing, heartbeat checks and so on for them ourselves.

The same wheels would have to be reinvented for every service written in another language in the future. For all these reasons, we decided to replace Eureka with the networking capabilities provided by OpenShift.

Since local development and debugging still rely on Eureka, the switch is controlled purely by configuration parameters in production:

```
ribbon.eureka.enabled=false
foo.ribbon.listOfServers=http://foo:8080
```

With 'ribbon.eureka.enabled' set to false, Ribbon no longer fetches the service list from Eureka; with 'foo.ribbon.listOfServers' set to 'http://foo:8080', any call that targets service foo goes directly to http://foo:8080.

Reworking CI

The CI changes are mainly about building the Docker image and pushing it to Harbor; deployments then pull the image directly from Harbor. The other change concerns the database upgrade tool. Previously we used Flyway, which automatically executed the SQL upgrade scripts when the application started.

As the number of service instances grows, it is not uncommon for multiple instances of the same service to be upgraded at the same time. Flyway uses database locks to ensure the upgrade never runs concurrently, but the instances waiting on the lock take longer to start.

Looking at the actual upgrade process, it makes more sense to move the potentially concurrent upgrades into a single process. In addition, a sharded database-and-table architecture makes it hard to upgrade the database automatically at application startup. Taking all this into account, we split the upgrade task out so that each service has its own upgrade, and the upgrade itself is containerized.

The upgrade image is used as a run-once tool, that is, via docker run --rm. It also supports specifying a target version, which has proved very useful for cross-version upgrades in private-deployment projects.
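For reference, the core of such a run-once upgrade tool can be as small as a few lines around the Flyway fluent API (Flyway 5+); the environment variable names and wiring here are placeholders, and the actual tool wraps more logic around this:

```java
import org.flywaydb.core.Flyway;
import org.flywaydb.core.api.MigrationVersion;

// Run-once migration entry point, meant to be executed via `docker run --rm`.
public class DbUpgrade {

    public static void main(String[] args) {
        // Placeholders: real values would come from environment variables or arguments.
        String url = System.getenv("DB_URL");
        String user = System.getenv("DB_USER");
        String password = System.getenv("DB_PASSWORD");
        // Optional target version, used for cross-version upgrades of private deployments.
        String target = System.getenv("DB_TARGET_VERSION");

        Flyway flyway = Flyway.configure()
                .dataSource(url, user, password)
                .target(target != null ? MigrationVersion.fromVersion(target)
                                       : MigrationVersion.LATEST)
                .load();
        flyway.migrate();
    }
}
```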

As for automated deployment, the upstream and downstream relationships between services, for example the Config server and Eureka being base services that the others depend on, impose an order on deployment. A Jenkins-based Pipeline solves this problem nicely.

Summary

Each of the above points could be written in depth as an article. The evolution of the architecture of microservices involves development, testing, and operations, and requires close collaboration across multiple teams.

Divide and conquer is the software industry’s fundamental way of tackling large systems. As a small team we do not blindly chase what is new; instead, we use a service-oriented approach to solve the problems that arise as the business develops.

On the other hand, we have also come to realize that microservices demand more from individual engineers and pose greater challenges to the team than before. There is still much to explore, and the evolution is still under way.