Introduction to Software Architecture

What is software architecture

Let's start with Wikipedia's definition of software architecture: software architecture is an abstract description of the overall structure and components of a software system, used to guide the design of the various aspects of large-scale software systems. The definition is short but not very clear, and I believe most people find it very abstract at first glance. A common line of thinking goes: "I have three to five years of work experience, so architecture design is not urgent for me. I will learn more of technology XX first, and once I have nearly mastered it, I will start learning architecture design." This way of thinking is wrong. Architecture design is needed at every scale, from something as large as Taobao's architecture to something as small as a student's graduation project. The core of architecture is not how impressive or how numerous the technologies in use are. The ultimate goal of software architecture is to build and maintain a system that meets the requirements with the least human cost. Normal development work therefore offers many practical opportunities to exercise your architectural design skills, even a daily code review.

The value of architectural design

A software system can be measured along two value dimensions: behavior and architecture. Ideally, programmers should keep the system's value in both dimensions healthy over the long term. In reality, as a project evolves, the original architectural design is repeatedly compromised to accommodate "behavioral" changes, until the software is no longer "soft" and the value of the system eventually approaches zero.

Behavioral value is the most intuitive value dimension of a software system. A programmer's daily work is to write requirements documents from what customers ask for and then turn those documents into a working system, creating value for the system's users. When the system runs into a problem, the programmer debugs and fixes it. That is all most programmers do: document requirements, write code, fix bugs. What is wrong with that? Even software that starts with a well-designed architecture gradually makes all kinds of compromises for the sake of features. It goes from "soft" to "hard", until making new changes becomes unbearable and can only be done by working overtime. The reason is that we simply pile features onto the system according to the requirements documents, patch bugs, and keep the program running, maximizing the immediate benefit without noticing that we are eroding the original architecture step by step.

The second value dimension of a software system is found in the word "software" itself. "Ware" means "product", and "soft" refers to flexibility. The purpose of software is to let us change the behavior of machines flexibly. The parts of a machine whose behavior is hard to change, such as memory and the CPU, we call hardware. To preserve architectural value, we need to keep the software system "soft" at all times: the software should be open to extension and easy to modify, and when requirements are added or changed, the corresponding change should be simple and convenient to implement. The cost of a code change should be proportional to the size of the change itself. For example, if it took 3 days for a system to support WeChat Pay but 5 days to add Alipay afterwards, with a large amount of WeChat Pay code needing to change, that is unacceptable. From the customer's point of view, the changes they request are all of a similar kind, so they should cost about the same. But when developers keep making small changes while neglecting the design of the software architecture, the system becomes harder and harder to maintain, until it can no longer be modified.

To be a qualified developer, the most basic and most important thing is to stay highly alert to architectural value while delivering behavioral value, and to keep thinking about how to keep the software "soft".

From clean code to modular design

Write clean code

Uncle Bob's book series is a guide to how a novice programmer can work their way up step by step. Clean Code teaches us how to write readable, extensible, maintainable, reusable code and lays down the basic skills. The Clean Coder teaches us how to become a professional programmer. Moving from the micro (code level) to the macro (architecture level), Clean Architecture introduces the three programming paradigms, design principles, and an architectural model called clean architecture. So, as a first step, how do you write clean code? Clean Code cites the 5S principles:

  • Sort: naming conventions
  • Systematize: put code where it belongs
  • Shine: clean up the code
  • Standardize: a consistent code style and set of practices
  • Self-discipline: continuous improvement

There is no fixed, quantifiable standard for writing code, but some general guidelines apply:

  • Readability is always paramount (think of being forced to read someone else's code)
  • Meaningful names (variable names, method names, function names, class names, package names)
  • Do only one thing (reduce coupling, improve readability)
  • Reduce dependencies (improve program robustness)
  • Avoid unnecessary duplicate code
  • Avoid unnecessary comments and let the code explain itself
  • Pass all the tests

This article focuses on design principles and clean architecture, so it does not go into the specifics of clean code. For more information, see Clean Code and Refactoring.

Revisiting the SOLID principles

Writing clean code is as critical to building a good software system as making good bricks is to constructing a building. With solid bricks, it is possible to go on to build solid walls and rooms, and eventually high-rise buildings. The SOLID principles tell us how to arrange groups of data and functions into classes, and how those classes should be connected into programs (a class here simply means a grouping of data and functions). SOLID applies to the middle-level components and modules of an architecture. Its purpose is to:

  • Build software that tolerates change and is easy to extend
  • Design components around abstractions, so that the software expresses its intent better and is easier to understand
  • Build reusable components

Single Responsibility Principle (SRP)

SRP is probably one of the most frequently mentioned principles in daily development. Most people take it to mean that each class should do only one thing, which is a vague statement. A more accurate definition is:

A software module should be responsible to one, and only one, class of actors

Here:

  • A software module is a source file, or more generally a cohesive set of functions and data structures
  • An actor is a group of one or more users who share the same needs

As an example from the book, consider an Employee class in a payroll management system that has three functions: one calculates pay, one reports hours, and one saves the employee.

The three functions in this class clearly violate SRP and should not all live in Employee: calculating pay, reporting hours, and saving serve finance, human resources, and the DBAs, respectively. How do you separate the responsibilities in this situation? The simplest and crudest way is to separate the data from the functions.

Another option is the Facade pattern, which hides the implementation logic and exposes only a simple API.
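
The two options above can be sketched together in Java. This is only an illustration under assumptions: the class names (EmployeeData, PayCalculator, HourReporter, EmployeeSaver, EmployeeFacade), the fields, and the pay rule are hypothetical and are not taken from the book or from any real payroll system.

```java
import java.util.ArrayList;
import java.util.List;

// Plain data with no behavior; all three actors share it.
final class EmployeeData {
    final String name;
    final int hoursWorked;
    final int hourlyRateCents;

    EmployeeData(String name, int hoursWorked, int hourlyRateCents) {
        this.name = name;
        this.hoursWorked = hoursWorked;
        this.hourlyRateCents = hourlyRateCents;
    }
}

// Changes requested by finance only touch this class.
final class PayCalculator {
    int calculatePayCents(EmployeeData e) {
        return e.hoursWorked * e.hourlyRateCents; // placeholder rule (assumption)
    }
}

// Changes requested by human resources only touch this class.
final class HourReporter {
    String reportHours(EmployeeData e) {
        return e.name + ": " + e.hoursWorked + "h";
    }
}

// Changes requested by the DBAs only touch this class.
final class EmployeeSaver {
    // An in-memory list stands in for the database (assumption).
    private final List<EmployeeData> saved = new ArrayList<>();

    void save(EmployeeData e) { saved.add(e); }
    int savedCount() { return saved.size(); }
}

// Facade: one simple entry point that only delegates to the three classes.
final class EmployeeFacade {
    private final PayCalculator payCalculator = new PayCalculator();
    private final HourReporter hourReporter = new HourReporter();
    private final EmployeeSaver saver = new EmployeeSaver();

    int calculatePayCents(EmployeeData e) { return payCalculator.calculatePayCents(e); }
    String reportHours(EmployeeData e) { return hourReporter.reportHours(e); }
    void save(EmployeeData e) { saver.save(e); }
    int savedCount() { return saver.savedCount(); }
}
```

Callers that want a single entry point use EmployeeFacade, while a change requested by one actor stays inside that actor's class.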

Open-Closed Principle (OCP)

OCP best reflects a system's ability to be extended. When a new requirement arrives, can changes to existing code be minimized or even avoided altogether? A good software design should be easy to extend and resistant to modification. To achieve this, the dependency inversion principle must be observed: the volatile side should depend on the stable side, and the stable side should not depend on volatile implementations. This lets you add functionality while minimizing changes to the internals (the more stable side).
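
As a rough sketch of this idea, the payment example from earlier could be arranged so that adding Alipay requires no change to the WeChat code. All names here (PaymentMethod, WeChatPayment, AlipayPayment, CheckoutService) and the returned strings are hypothetical; no real payment SDK is involved.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stable side: the abstraction that checkout depends on.
interface PaymentMethod {
    String pay(long amountCents);
}

// Volatile side: each provider lives in its own class.
final class WeChatPayment implements PaymentMethod {
    public String pay(long amountCents) { return "wechat:" + amountCents; }
}

// Added later without editing WeChatPayment or CheckoutService.
final class AlipayPayment implements PaymentMethod {
    public String pay(long amountCents) { return "alipay:" + amountCents; }
}

// Depends only on the PaymentMethod abstraction.
final class CheckoutService {
    private final Map<String, PaymentMethod> methods = new LinkedHashMap<>();

    void register(String name, PaymentMethod method) { methods.put(name, method); }

    String checkout(String name, long amountCents) {
        PaymentMethod method = methods.get(name);
        if (method == null) {
            throw new IllegalArgumentException("Unknown payment method: " + name);
        }
        return method.pay(amountCents);
    }
}
```

Supporting a new provider then means writing one new class and one register call, which keeps the cost of the second payment method close to the cost of the first.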

Liskov Substitution Principle (LSP)

LSP defines a kind of substitutability: if for each object o1 of type S there is an object o2 of type T such that, for every program P defined in terms of T, the behavior of P is unchanged when o1 is substituted for o2, then S is a subtype of T. Its main role is to guide the use of inheritance and the design of interfaces and their implementations.
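
A small sketch of what substitutability looks like in code, assuming a hypothetical License type with two implementations. The names and fee amounts are invented for illustration.

```java
import java.util.List;

// The type T that client programs are written against.
interface License {
    long calcFeeCents();
}

// Two subtypes; both honor the same contract (return a non-negative fee).
final class PersonalLicense implements License {
    public long calcFeeCents() { return 1000; } // made-up fee
}

final class BusinessLicense implements License {
    private final int users;

    BusinessLicense(int users) { this.users = users; }

    public long calcFeeCents() { return 5000L * users; } // made-up fee
}

// The program P: it never inspects the concrete type, so any License can be substituted.
final class Billing {
    long totalCents(List<License> licenses) {
        long total = 0;
        for (License license : licenses) {
            total += license.calcFeeCents();
        }
        return total;
    }
}
```

Billing keeps working for any mix of licenses. A subtype that threw an exception from calcFeeCents, or required an instanceof check in Billing, would break the substitution.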

Interface Segregation Principle (ISP)

Clients should not be forced to depend on interfaces they do not use. If an interface contains too many methods and different implementing classes each need only some of them, split it into separate interfaces, so that each class implements only the interfaces it needs. More generally, at any level of software design, depending on something you do not need can be harmful.
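
A minimal sketch of splitting one wide interface into two narrow ones. The device example and every name in it (DocumentPrinter, DocumentScanner, SimplePrinter, OfficeMachine, ReportJob) are hypothetical.

```java
// Two narrow interfaces instead of one "do everything" device interface.
interface DocumentPrinter {
    String print(String document);
}

interface DocumentScanner {
    String scan();
}

// Implements only what it can actually do.
final class SimplePrinter implements DocumentPrinter {
    public String print(String document) { return "printed " + document; }
}

// A richer device implements both interfaces.
final class OfficeMachine implements DocumentPrinter, DocumentScanner {
    public String print(String document) { return "printed " + document; }
    public String scan() { return "scanned page"; }
}

// The client depends only on the capability it uses.
final class ReportJob {
    private final DocumentPrinter printer;

    ReportJob(DocumentPrinter printer) { this.printer = printer; }

    String run() { return printer.print("report"); }
}
```

ReportJob is unaffected by any change to scanning, because nothing it depends on mentions scanning.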

Dependency Inversion Principle (DIP)

DIP is a very important idea in coding, and it plays a role in everything from component design to many architectural designs. DIP says that a program should depend on abstract interfaces, not on concrete implementations. Both high-level and low-level modules should depend on abstractions. The advantage of programming to interfaces is that the design can set implementation aside and define the abstract logic, which makes the designed components more stable (in object-oriented languages, interfaces change less often and are more stable than concrete classes). Therefore, to pursue architectural stability, prefer stable interfaces and depend less on volatile concrete implementations. In practice, DIP means:

  • Refer to abstract interfaces in your code and avoid referring to volatile concrete classes. Objects are typically created through the factory pattern
  • Do not derive from volatile concrete classes
  • Do not override concrete functions. (Concrete functions often require source-level dependencies, and overriding them inherits those dependencies rather than removing them.) When an override is needed, declare an abstract function and let the subclasses implement it
  • Avoid mentioning the names of concrete or otherwise volatile things in your code

The curve in the middle of the figure represents the boundary between the abstraction layer and the implementation layer of the architecture. All source-level dependencies that cross this boundary should point in one direction: from the concrete implementation layer toward the abstraction layer.
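
The factory-based arrangement described above might look like the following sketch. The two sides of the boundary are marked with comments. Notifier, NotifierFactory, Application, EmailNotifier, and EmailNotifierFactory are hypothetical names chosen for illustration.

```java
// ----- Abstraction side of the boundary -----

interface Notifier {
    String send(String message);
}

interface NotifierFactory {
    Notifier create();
}

// High-level code: it names only abstractions and never calls "new" on a concrete class.
final class Application {
    private final NotifierFactory factory;

    Application(NotifierFactory factory) { this.factory = factory; }

    String run() {
        Notifier notifier = factory.create();
        return notifier.send("hello");
    }
}

// ----- Implementation side of the boundary -----
// Everything below depends on the abstractions above, never the other way around.

final class EmailNotifier implements Notifier {
    public String send(String message) { return "email:" + message; }
}

final class EmailNotifierFactory implements NotifierFactory {
    public Notifier create() { return new EmailNotifier(); }
}
```

Only the code that wires the program together, for example a main method, needs to mention EmailNotifierFactory. Replacing email with another channel leaves Application untouched.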

Introduction to clean architecture

Clean architecture builds on several earlier architectures: Hexagonal Architecture, DCI (Data, Context, Interaction) architecture, and BCE (Boundary, Control, Entity) architecture.

Hexagonal architecture

Hexagonal architecture was proposed by Alistair Cockburn in 2005 to solve the problems caused by traditional layered architecture. It is still a layered architecture, but it is layered from the inside out rather than from top to bottom. The traditional three-tier architecture (presentation layer, business logic layer, data access layer) has several problems:

  • The core logic of the program may be scattered across layers, which makes it very difficult to replace a layer later
  • Testing the core logic is very difficult
  • The core logic depends on specific third-party dependencies, which binds the business tightly to specific technologies and makes them hard to change

Hexagonal architecture is also called ports and adapters. It divides the system into an inside and an outside: the inside is the application's core business logic, and the outside is whatever drives the application, its infrastructure, or other applications. Its main characteristics are:

  • Separation of concerns: the value of the software lies in its business value, so the focus is on the business logic. Business logic is separated from external drivers (specific technologies), so business and technology are independent of each other, which makes the business logic more stable and easier to test
  • Replaceable outside: a port, which can have multiple adapters, is the inside's abstraction of the outside. The inside does not care how the outside uses the port and assumes from the start that whatever is outside can be replaced. Data persistence is an example: the business logic does not need to know which storage technology is used. It only provides a port, and an adapter implements it
  • Dependency inversion: dependency inversion is the foundation of hexagonal architecture. To keep the internal business logic stable, the inside must not depend on outside components. Only the outside depends on the inside, and the outside can be replaced. Achieving this requires dependency inversion: driving adapters call into the application, ports are defined inside the application, and the technical implementation is done by adapters
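
The characteristics above can be sketched as one port, one driven adapter, and one driving adapter around a tiny core. Everything here is a hypothetical illustration: NotePort, NoteApplication, InMemoryNoteAdapter, CommandLineAdapter, and the command format are assumptions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Port: defined inside the application core, describes what the core needs from outside.
interface NotePort {
    void save(String id, String text);
    Optional<String> find(String id);
}

// Core business logic: it knows the port and nothing about storage technology.
final class NoteApplication {
    private final NotePort notes;

    NoteApplication(NotePort notes) { this.notes = notes; }

    void publish(String id, String text) {
        if (text == null || text.isBlank()) {
            throw new IllegalArgumentException("empty note");
        }
        notes.save(id, text.trim());
    }

    String read(String id) { return notes.find(id).orElse("(missing)"); }
}

// Driven adapter: one replaceable implementation of the port.
final class InMemoryNoteAdapter implements NotePort {
    private final Map<String, String> data = new HashMap<>();

    public void save(String id, String text) { data.put(id, text); }
    public Optional<String> find(String id) { return Optional.ofNullable(data.get(id)); }
}

// Driving adapter: translates an external command format into calls on the core.
final class CommandLineAdapter {
    private final NoteApplication app;

    CommandLineAdapter(NoteApplication app) { this.app = app; }

    String handle(String command) {
        String[] parts = command.split(" ", 3);
        if (parts[0].equals("publish") && parts.length == 3) {
            app.publish(parts[1], parts[2]);
            return "ok";
        }
        if (parts[0].equals("read") && parts.length == 2) {
            return app.read(parts[1]);
        }
        return "unknown command";
    }
}
```

Swapping InMemoryNoteAdapter for a database-backed adapter, or CommandLineAdapter for a REST controller, would not require any change to NoteApplication.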

DCI architecture

DCI stands for Data, Context, and Interaction. DCI focuses on how objects interact in different scenarios and is an object-oriented paradigm for designing state and behavior. Although the traditional MVC architecture is simple and clear in structure, it scatters the business logic, so that as the project grows more complex the code becomes confusing and hard to trace. The most prominent feature of DCI is use-case-driven design, which fits the user's mental model more closely: the code is the requirements, and use-case-based code is easier for programmers to understand later.

Clean architecture explained

Clean architecture combines ideas from hexagonal architecture, DCI architecture, and BCE architecture. These architectures share a common design goal: to partition the software according to different concerns, with at least one layer containing the core logic of the software. Its features:

  • Framework independent: the architecture of the system does not depend on any particular framework. A framework can be used as a tool, but the system should not be bent to fit the framework
  • Testable: the business logic of the system can be tested independently of the UI, the database, the Web, and other external elements
  • UI independent
  • Database independent
  • Independent of any other third-party dependency

The dependency rule

The concentric circles in the figure represent different layers of the software system. The closer a layer is to the center, the higher its level. The outer circles are mechanisms and the inner circles are policies. Source code dependencies must point inward, that is, low-level mechanisms depend on high-level policies. Code in an inner circle should not refer to code in an outer circle. In particular, it should not mention any variable, function, or class declared in an outer circle.

Key concepts of clean architecture

Business entities: The business entity layer encapsulates the key business logic of the entire system. A business entity can be an object with methods or a collection of data structures and functions. It does not matter which, as long as it can be reused by the different applications in the system.

Use cases: The use case layer usually contains application-specific business logic. It encapsulates and implements all the use cases of the system. These use cases orchestrate the flow of data into and out of the business entities and direct the entities to use their key business logic to achieve the goals of the use case. The use case layer is outside the business entity layer, so changes to it should not affect the entities, and the entity layer should not depend on it. The use case layer is also inside the external frameworks and infrastructure, so it should not depend on them directly.

Interface adapters: The interface adapter layer is usually a set of data converters that convert data from the format most convenient for use cases and business entities to the format most convenient for external systems such as the database and the Web. For example, this layer should wholly contain the MVC architecture of a GUI. The presenters, views, and controllers are all adapters. A business request is passed from the controller to the use case layer, which is the entry point of business logic processing, and the result is finally returned to the controller.

This layer is also responsible for converting data from the format most convenient for use cases and entities to the format most convenient for the persistence framework in use. In other words, converting business entity data into the PO (persistent object) format needed by the database happens in the interface adapter layer, so if the persistence framework is replaced in the future, the inner logic is not affected.

This layer is likewise responsible for converting data from outside the system into the format required by the business entities inside it. For example, in a microservice architecture, a call from service A to service B is made in the interface adapter layer, and the returned data is converted there into the format required by the business entities of the current service.

Crossing boundaries

Each concentric circle in the figure represents a boundary. Source-level dependencies must always point from an outer layer to an inner layer. The further in a layer is, the higher its level of abstraction and policy: the innermost circle contains the most general, highest-level policies, and the outermost circle contains the most concrete implementation details. Sometimes the inner layer must call the outer layer, for example when the use case layer needs to call a gateway adapter to persist data. A direct call is not allowed by the rules of clean architecture, so the dependency inversion principle is used to resolve the conflict: the use case layer defines the policy it needs for persistence, typically as an interface, and the gateway adapter implements that interface by invoking a specific persistence framework. This avoids a direct dependency of the use case layer on the adapter layer.

Plug-in thinking

When we want to design a system with better extensibility and maintainability, we need plug-in thinking in the design: abstract the users' scenarios, consolidate them, and encapsulate them as abstract high-level policies exposed to users. Convention or injection is typically used. For example, the Spring framework gives users a variety of pre-/post-processor hooks: users implement custom processors and inject them into Spring, and Spring invokes them according to the corresponding policy. For a software system, the use case and business entity layers encapsulate the core logic and the high-level policies. These may include policies for data persistence or for calling third-party dependencies. The policies are exposed to callers, who write code that conforms to them in order to plug into the functionality we provide. SonarQube, for example, allows developers to upload their own code quality check implementations.
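
A plain-Java analogy of this plug-in idea is sketched below. It is not Spring's API: NotePreProcessor and NotePipeline are hypothetical names, and registration is done by hand instead of by a container.

```java
import java.util.ArrayList;
import java.util.List;

// High-level policy exposed to users: a hook that runs before a note is stored.
interface NotePreProcessor {
    String process(String text);
}

// The core owns the policy and the order of execution; plug-ins supply the behavior.
final class NotePipeline {
    private final List<NotePreProcessor> processors = new ArrayList<>();

    // "Injection": callers plug in their own implementations.
    NotePipeline register(NotePreProcessor processor) {
        processors.add(processor);
        return this;
    }

    // Applies every registered processor in registration order.
    String run(String text) {
        String result = text;
        for (NotePreProcessor processor : processors) {
            result = processor.process(result);
        }
        return result;
    }
}
```

A caller could write new NotePipeline().register(String::trim).register(s -> "[" + s + "]") to add behavior without touching NotePipeline itself.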

Deferred-decision thinking

After the high-level policies have been designed in the initial architecture, how should the technical details, such as data persistence and data presentation, be chosen? Deferring decisions helps here. An agile team delivers projects iteratively. According to lean thinking, we should reduce waste and maximize the value delivered in each iteration. So what is waste? Anything unrelated to the goals of the current iteration is waste, and if the team does something that only adds value in the next iteration, that is waste too. That is what deferred decisions are about: if something does not have to be decided now, put it off until it must be decided, because by then our understanding of it is at its greatest. The prerequisite is that the high-level policies, such as the policy for data persistence, have been designed. In the beginning, the business may only need to store simple data. Is a database really needed at that point? Wouldn't a simple file store deliver more value at this stage?

In practice: refactoring an MVC architecture into clean architecture

Project Background

After all the theory, here is a project in practice: the code of a very common MVC architecture, and how to refactor it into clean architecture. The system is a fairly simple microservice, small but complete. (To keep the demonstration of the refactoring short, the code contains no tests. In general, tests must be in place before refactoring begins.)

The system has two roles, student and teacher:

  • Student: has two functions. 1. Publish notes; 2. Delete notes (if the note is marked excellent, its excellent-status record is deleted as well)
  • Teacher: has two functions. 1. Mark a student's note as excellent; if marking succeeds, the notification service is called to send a notification; 2. Remove a student's note from excellent status, and then call the notification service to send a cancellation notification

MVC version code and legacy issues

The MVC version of the code is available on GitHub. The current version still has a lot of code smells and problems, which we will refactor away together:

  • There are a lot of maps in the code
  • Business logic and specific technical implementations are tightly bound and depend on each other
  • There is no domain model; the PO is used as the domain model
  • There are no corresponding models for API requests and responses

Refactoring steps

The clean-arch branch contains one commit per refactoring step; you can check out the commit that corresponds to each step.

1. Layering

1.1 Create the adapter and domain packages: adapter/inbound/rest/resources, adapter/outbound/gateway, adapter/outbound/persistence, and domain

1.2 Move the controller classes in controller/ to adapter/inbound/rest/resources and split them into packages by domain

1.3 Move repositories, Feign clients, and services

  • Encapsulate the controller's request parameters in a corresponding xxRequest object
  • Encapsulate the controller's return data in a corresponding xxResponse object
  • Move repositories/ to adapter/outbound/persistence/
  • Move feign/ to adapter/outbound/gateway/
  • Move services/ to domain/

2. Separate the PO from the domain model

Move models/ into domain/, into the package of the corresponding domain, then copy each model to adapter/outbound/persistence/ and rename the copy xxPO. Finally, delete all persistence-related configuration from the objects under domain/.

3. Use DIP to remove direct dependencies on the persistence layer

At present, the domain services still call the jpaRepository directly for data operations, so DIP is needed to invert the dependency.

  • Rename the repository under adapter/outbound/persistence/ to repositoryVendor
  • In domain/xx/, create the corresponding interface xxRepository, and declare in it the methods that were previously called directly
  • In adapter/outbound/persistence/xx/, create the implementation class xxRepositoryImpl, which implements the functions of xxRepository
  • In domain/xxService, inject xxRepository through DI to remove the direct dependency on jpaRepository
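
The four steps above could end up looking roughly like this for a note entity. This sketch deliberately avoids Spring so that it stays self-contained: NoteRepositoryVendor is an in-memory stand-in for the renamed Spring Data JPA repository, constructor parameters stand in for DI, and all names and fields (Note, NotePO, NoteService, and so on) are assumptions rather than the project's actual code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// ----- domain/note: the domain model and the interface the domain owns -----

final class Note {
    final long id;
    final String content;

    Note(long id, String content) {
        this.id = id;
        this.content = content;
    }
}

interface NoteRepository {
    Note save(Note note);
    Optional<Note> findById(long id);
}

final class NoteService {
    private final NoteRepository repository; // injected; no JPA type appears here

    NoteService(NoteRepository repository) { this.repository = repository; }

    Note publish(long id, String content) { return repository.save(new Note(id, content)); }
}

// ----- adapter/outbound/persistence/note -----

// Persistence object; in the real project this would carry the JPA annotations.
final class NotePO {
    long id;
    String content;
}

// Stand-in for the renamed JPA repository (assumption: the real one is a Spring Data interface).
final class NoteRepositoryVendor {
    private final Map<Long, NotePO> table = new HashMap<>();

    NotePO save(NotePO po) { table.put(po.id, po); return po; }
    Optional<NotePO> findById(long id) { return Optional.ofNullable(table.get(id)); }
}

// Implements the domain interface and converts between Note and NotePO.
final class NoteRepositoryImpl implements NoteRepository {
    private final NoteRepositoryVendor vendor;

    NoteRepositoryImpl(NoteRepositoryVendor vendor) { this.vendor = vendor; }

    public Note save(Note note) {
        NotePO po = new NotePO();
        po.id = note.id;
        po.content = note.content;
        NotePO saved = vendor.save(po);
        return new Note(saved.id, saved.content);
    }

    public Optional<Note> findById(long id) {
        return vendor.findById(id).map(po -> new Note(po.id, po.content));
    }
}
```

The source-level dependency now points from the adapter to the domain: NoteRepositoryImpl imports NoteRepository, while NoteService knows nothing about NotePO or the vendor repository.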

4. Use DIP to remove the domain's dependency on the gateway

Apply DIP to invert the dependency and remove the domain layer's direct dependence on the gateway layer (Spring Feign).
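
The same inversion can be sketched for the notification call. The names below (NotificationGateway, ExcellentNoteService, RecordingNotificationGateway) are hypothetical. In the real project the adapter would presumably wrap the Feign client, which is replaced here by a list that records calls.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Inner layer: the interface describes what the business logic needs, not how it is sent.
interface NotificationGateway {
    void notifyMarked(long noteId);
}

final class ExcellentNoteService {
    private final NotificationGateway gateway;
    private final Set<Long> excellentNoteIds = new HashSet<>();

    ExcellentNoteService(NotificationGateway gateway) { this.gateway = gateway; }

    // Marks a note as excellent and sends a notification only when the mark is new.
    boolean markExcellent(long noteId) {
        if (!excellentNoteIds.add(noteId)) {
            return false; // already marked, nothing to notify
        }
        gateway.notifyMarked(noteId);
        return true;
    }
}

// adapter/outbound/gateway: stand-in for an adapter that would call the notification service.
final class RecordingNotificationGateway implements NotificationGateway {
    final List<String> sent = new ArrayList<>();

    public void notifyMarked(long noteId) { sent.add("marked:" + noteId); }
}
```

Because the service sees only NotificationGateway, the HTTP client can change, or be replaced by a fake in tests, without touching the business logic.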

5. Extract use cases

  • Sort the business logic in the domain services by requirement use case and move it into the use case layer, application/usecases/. In general, each business entity is split into two kinds of use cases, edit and query, which provide the entry points for editing and querying respectively. The controllers that used to call the domain services now depend on the corresponding xxUseCase
  • Move the code under domain/notification into application/gateway/notification. The notification context is not a core domain of the current service; relative to the note service, it is a third-party service

6. Remove code where inner layers depend on outer layers

The outer layer may depend on the inner layer, but the inner layer must not depend on the outer layer. For example, the application layer depends on xxRequest from the adapter layer. For this kind of data-object dependency, pass the data inside the outer object to the inner layer as primitive-typed parameters. If there are too many parameters, create a corresponding xxDto in the inner layer, construct it in the outer layer, and pass it to the inner layer as the parameter.
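
A sketch of the xxDto approach, assuming a hypothetical create-note request. CreateNoteRequest, CreateNoteDto, CreateNoteUseCase, NoteController, and their fields are invented for illustration, and the controller is a plain class rather than a Spring controller.

```java
// ----- application layer (inner): owns the DTO and the use case -----

final class CreateNoteDto {
    final String studentId;
    final String title;
    final String content;

    CreateNoteDto(String studentId, String title, String content) {
        this.studentId = studentId;
        this.title = title;
        this.content = content;
    }
}

final class CreateNoteUseCase {
    // Placeholder logic: returns a description instead of persisting anything.
    String execute(CreateNoteDto dto) {
        return dto.studentId + ":" + dto.title;
    }
}

// ----- adapter/inbound/rest (outer): owns the request object -----

final class CreateNoteRequest {
    final String studentId;
    final String title;
    final String content;

    CreateNoteRequest(String studentId, String title, String content) {
        this.studentId = studentId;
        this.title = title;
        this.content = content;
    }
}

final class NoteController {
    private final CreateNoteUseCase useCase;

    NoteController(CreateNoteUseCase useCase) { this.useCase = useCase; }

    // The outer layer builds the inner layer's DTO; the use case never sees the request type.
    String create(CreateNoteRequest request) {
        CreateNoteDto dto = new CreateNoteDto(request.studentId, request.title, request.content);
        return useCase.execute(dto);
    }
}
```

CreateNoteUseCase compiles without CreateNoteRequest on its classpath, which is a quick way to check that the inner layer no longer depends on the outer one.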

Refactoring experience summary

Strict mode

If coding skill varies significantly within the team, it is recommended to follow the architectural pattern of the example strictly: every request must go from the controller to a use case, and the use case then calls the corresponding domain service. The use case or domain service may sometimes be nothing more than a pass-through layer, and writing code this way is verbose, but in the long run it is a good remedy for code clutter.

Loose mode

To avoid use cases or domain services that act only as a pass-through layer, in loose mode we can decide case by case whether a layer is needed and whether it can be skipped:

  • When a controller only performs the simplest CRUD operations, it can inject the corresponding repository directly
  • When a scenario requires calling multiple domain services, create a corresponding usecase and orchestrate the calls in it
  • Data objects that appear in the usecase and domain service layers are generally named xxDto. If an xxDto is used only in the usecase layer, it should live in the usecase layer
  • Layers at the same level may depend on each other
  • There should be no circular dependencies

Conversion of data across layers

This follows from the principle that an inner layer cannot depend on data from an outer layer. If the outer layer wants to pass an outer data object to the inner layer, the normal approach is to take the object apart and pass its fields to the inner layer as arguments. If there are many fields, a corresponding xxDto can be created in the inner layer; the outer layer then constructs this xxDto and passes it in.

References

  • Robert C. Martin. Clean Architecture. Chinese edition translated by Sun Yucong, Publishing House of Electronics Industry