Discussion of the component cohesion tension diagram

During my lunch break on Wednesday, I gave a guided reading of Clean Architecture at ThoughtWorks’ Beijing office. When I got to the component cohesion principles, many colleagues found them hard to understand. The reason is that we could not relate the consequences of violating the principles to problems in real projects, which leaves a gap between principle and practice. The discussion was intense, but regrettably it did not reach a convincing conclusion. To clarify these issues further, I decided to organize a dedicated discussion on the tension between the component cohesion principles. At Master Wu’s urging, the time was set for 8:30 the following Thursday evening. Most of the participants were technical coaches from the consulting team, along with clients from our projects.

One of the first points of contention during the two and a half hours of discussion was the definition of components.

A component is the smallest unit of software deployment: the smallest entity that can be deployed independently during the deployment of the whole software system.

The Big Devil challenged this definition: a library cannot be deployed independently. When a definition has such an obvious logical hole, the best approach is to set the translation aside and go back to the original.

Components are the units of deployment. They are the smallest entities that can be deployed as part of a system.

After reading the original, we found that components are described as the smallest entities that can be deployed as part of a system. This is a statement about what a component is within the system, not about the dynamic process of deployment; read the other way, the definition would be a tautology. So nothing in this definition says that components can be deployed independently. Components can be linked into a single executable, aggregated into a single archive, or packaged as .jar, .dll, or .exe files and deployed independently as dynamically loaded plugins.

Interpreting the definition of a component

From the original text:

Components can be linked together into a single executable. Or they can be aggregated together into a single archive, such as a .war file. Or they can be independently deployed as separate dynamically loaded plugins, such as .jar or .dll or .exe files.

From the discussion:

20:56:56 From tianjie : These dynamically linked files, which can be plugged together at runtime, are the software components of our architectures.

With this context, we know that components can be designed to be deployed independently, but not every component can be. This needs to be made clear up front; otherwise the discussion of the cohesion principles easily goes astray.

Master Wu went on to explain that a component should be a logical unit, not a physical one. It is not appropriate to force a code module to be a physical unit of deployment. Uncle Bob makes the same point when he introduces architectural boundaries: architectural boundaries are not service boundaries.

Interpreting the REP principle

After I explained the REP, CCP, and CRP principles in my own way [1], the discussion soon focused on the interpretation and practical significance of the REP principle.

Master Wu argued that if the REP principle is read simply as “nothing can be reused without a release process”, it cannot balance the opposing pulls of the CCP and CRP principles, and the three cannot form a stable triangle. In that case the tension diagram would be of little practical value.

Inspired by the CAP theorem for distributed systems (Consistency, Availability, Partition tolerance), Shanqi offered another direction of interpretation. He said that in the practice of distributed systems, P is taken as a given first, and the balance is then struck between C and A. By analogy, in the triangle of REP, CCP, and CRP, the REP principle plays the role of P: it must be satisfied first, and only then do we choose between CCP and CRP.

The Big Devil’s understanding is that REP means a reusable component must be independently reusable. Back in the days of Maven and dependency management, if we relied on a package that itself depended on third-party packages, then that package was not independently reusable.

REP means that when you release a reusable package, it is independently reusable. You can’t make me carry another JAR along just to use it.

21:14:04 From YangYun : The granule of reuse is the granule of release

‘If there are two packages that offer the same functionality, and one has no third-party dependencies and the other does, then of course I choose the former,’ he added.
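The Big Devil's reading can be sketched as a check over a dependency graph. The following Python sketch is only an illustration and was not part of the discussion: the package names and the `DEPENDENCIES` graph are hypothetical, and "independently reusable" is assumed here to mean "pulls in no other package, directly or transitively".

```python
# Hypothetical dependency graph: package -> packages it depends on directly.
DEPENDENCIES = {
    "pdf-writer-a": [],
    "pdf-writer-b": ["xml-utils"],
    "xml-utils": ["string-helpers"],
    "string-helpers": [],
}


def transitive_dependencies(package, graph):
    """Return every package that `package` pulls in, directly or indirectly."""
    seen = set()
    stack = list(graph.get(package, []))
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(graph.get(dep, []))
    return seen


def is_independently_reusable(package, graph):
    """True when using `package` requires carrying no other package along."""
    return not transitive_dependencies(package, graph)


print(is_independently_reusable("pdf-writer-a", DEPENDENCIES))  # True
print(is_independently_reusable("pdf-writer-b", DEPENDENCIES))  # False
```

Under this assumption, the two hypothetical packages offer the same functionality, but only `pdf-writer-a` can be picked up without carrying anything else.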

Technical coach Sara gives a more complex but enlightening example.

Suppose the project contains submodules A, B, and C:

  • If A, B, and C are not packaged as JARs and reuse each other directly, that violates REP
  • If each submodule is packaged as a JAR and they reuse each other through a specific JAR version (such as a SNAPSHOT version), that conforms to REP
  • If the setup is REP compliant and all submodules are upgraded together with the entire project, it is also CCP compliant, because they are released as one
  • In this case, suppose A depends on B and C, and this time I only want to change C: all of them get a new version together. B's JAR has not changed at all, so this is an unnecessary release for B. It seems that B should be split out, but splitting it out moves further away from REP and CCP
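The last bullet can be illustrated with a small Python sketch of lockstep versioning. It is a sketch under assumptions, not something shown in the discussion: the function name `release_in_lockstep`, the integer version numbers, and the rule that a module needs a release only if it changed or directly depends on something that changed are all hypothetical.

```python
def release_in_lockstep(versions, depends_on, changed):
    """Bump every module together, as when the whole project is released as one.

    versions:   module name -> current integer version
    depends_on: module name -> set of modules it depends on directly
    changed:    set of modules whose content actually changed
    Returns (new_versions, unnecessary), where `unnecessary` lists modules
    that got a new version although nothing relevant to them changed.
    """
    # Modules that genuinely need a release: changed ones plus direct dependents.
    needs_release = set(changed)
    for module, deps in depends_on.items():
        if deps & changed:
            needs_release.add(module)

    new_versions = {name: version + 1 for name, version in versions.items()}
    unnecessary = sorted(set(versions) - needs_release)
    return new_versions, unnecessary


# A depends on B and C; only C changes this time.
new_versions, unnecessary = release_in_lockstep(
    versions={"A": 1, "B": 1, "C": 1},
    depends_on={"A": {"B", "C"}},
    changed={"C"},
)
print(new_versions)  # {'A': 2, 'B': 2, 'C': 2}
print(unnecessary)   # ['B']
```

B is released without any change, which is exactly the pull toward splitting it out.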

On the last statement, she clarified:

I ran into a situation like this before. Component A needed to use a common library. The lib contained, say, 3 submodules (1/2/3), all of which A reused. Whenever anything in 1/2/3 changed, the whole lib was upgraded together, and A then moved to the corresponding new version.

Later, a new component B came along that only needed 3 from the common lib, not 1/2, so 3 kept getting changed and repackaged. Each time, the version number of 1/2 went up as well, even though their content had not changed at all.

This scenario introduces two components, A and B, each relying on different modules of the common library. The constraints are much simpler when only one component depends on the library, but the whole point of reuse is to be depended on by multiple components, so this scenario is a valuable one.

Sara’s analysis is as follows:

If I split it, I effectively have two common libs (1/2 and 3). For B this is perfect: it only needs lib 3, so I can change 3 and then update B.

For A, however, lib 1/2 and lib 3 both have to be upgraded, which adds up to 3 releases (lib 1/2, lib 3, and A), whereas before it only took 2 releases (lib 1/2/3 and A), so it moves away from CRP. At the same time, the versions of the two libs have to be maintained separately, so CCP is also worse than before.

From her analysis, we found that CRP and CCP are not only mutually exclusive, but that neither can be fully satisfied. The reason is that the common library formed by modules 1/2/3 complies with the CCP and CRP principles from component A's point of view, but does not comply with the REP and CRP principles from component B's point of view, because every time B wants to rely on module 3, it has to rely on the entire common library with 1/2/3. On the other hand, if we split 3 out of 1/2/3 into a separate component, component A is all but certain to violate CCP and CRP, while component B benefits from complying with REP and CRP.
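The release counts in Sara's analysis can be reproduced with a short Python sketch. It is only an illustration: the names `releases_needed`, `COMBINED`, `SPLIT`, `CONSUMERS`, and the lib names are hypothetical, and it assumes a consumer must be re-released whenever any lib it depends on is released.

```python
# Two hypothetical packaging layouts: lib name -> set of submodules it contains.
COMBINED = {"lib-123": {1, 2, 3}}
SPLIT = {"lib-12": {1, 2}, "lib-3": {3}}

# Which submodules each consumer actually needs.
CONSUMERS = {"A": {1, 2, 3}, "B": {3}}


def releases_needed(layout, consumers, changed):
    """Return every artifact that must be released when `changed` submodules change."""
    # A lib is released if it contains at least one changed submodule.
    touched_libs = {lib for lib, modules in layout.items() if modules & changed}
    # A consumer is released if it depends on any released lib.
    touched_consumers = {
        name
        for name, needs in consumers.items()
        if any(layout[lib] & needs for lib in touched_libs)
    }
    return touched_libs | touched_consumers


only_a = {"A": CONSUMERS["A"]}
# A's view when 1 and 3 both change: 2 releases combined, 3 releases split.
print(sorted(releases_needed(COMBINED, only_a, {1, 3})))  # ['A', 'lib-123']
print(sorted(releases_needed(SPLIT, only_a, {1, 3})))     # ['A', 'lib-12', 'lib-3']

# B's view when only submodule 1 changes: dragged along combined, untouched split.
print(sorted(releases_needed(COMBINED, CONSUMERS, {1})))  # ['A', 'B', 'lib-123']
print(sorted(releases_needed(SPLIT, CONSUMERS, {1})))     # ['A', 'lib-12']
```

Neither layout is cheapest for both consumers, which is the tension the analysis describes.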

She then added:

In fact, I later had another thought when relating this to microservices. For example, a function like Mark to Market [2], which is simply a formula, is needed by multiple components in my system, and there are usually several ways to handle that:

  • -> REP bad, CCP bad, but CRP not bad
  • Write the formula in a common lib and reuse it -> REP good, CCP good, CRP bad
  • Put the formula in a separate service -> REP good, CCP bad (because there is one more service to maintain), CRP good

This view extends to different levels of reuse and can be seen as an exploration of how universally the component cohesion principles apply.

When the topic returned to reusability, technical coach MoMo made a point: what we are discussing are the principles that reusable components should follow, and REP defines the granularity of reuse. Components that have been on SNAPSHOT for years, with no notion of a release, should not be regarded as reusable components in the first place, so they are not anti-patterns of REP.

At the same time, Yan pointed out a translation error. In the translated tension diagram for component cohesion, the brief description of the REP principle reads “group for reusability”, but the original text is actually “Group for reusers”, which means grouping for the people who reuse the component. So a reusable release is a commitment to the outside.


[Figure: tension diagram]

External material

In between working overtime on a proposal and taking part in the discussion, the Big Devil quickly looked up some material, such as the definition of the REP principle on Wikipedia:

21:45:05 From YangYun : Reuse-release equivalence principle (REP): REP essentially means that the package must be created with reusable classes: “Either all of the classes inside the package are reusable, or none of them are”. The classes must also be of the same family. Classes that are unrelated to the purpose of the package should not be included. A package constructed as a family of reusable classes tends to be most useful and reusable. (Wikipedia)

In the Wikipedia definition, the REP principle contains elements of both the CRP and CCP principles, so the three principles do not seem to satisfy MECE (mutually exclusive, collectively exhaustive) classification. Even Uncle Bob is ambiguous about REP in his book, where it reads basically like a simplified description of CCP and CRP.

The Big Devil then found material that defines the word “granularity” in software design in detail, which breaks down and re-examines the definition of the REP principle (the minimum granularity of software reuse equals the minimum granularity of its release).

21:57:03 From YangYun: condor.depaul.edu/dmumaugh/OO…


[Figure: granularity]

21:58:29 From YangYun: fi.ort.edu.uy/innovaporta…


[Figure: design principles]

These ideas and academic materials are representative and worth coming back to repeatedly.

Anti-patterns

Software engineers generally have a habit of “getting things wrong”. Principles are abstract and patterns are concrete, but anti-patterns often guide practice even better. So the next discussion was about which anti-patterns violate the REP principle.

The first is git submodule, which some projects use to divide and share modules as source code. Because the code is shared as source, every update to it forces all dependent parties to recompile, redistribute, and redeploy. This makes reuse painful.

Then there are projects that have been using SNAPSHOT versions for years. These typically arise from reuse needs between internal teams within one product team. The disadvantages are obvious. A perennial SNAPSHOT is equivalent to having no versions and no release process. Users do not know which parts of the SNAPSHOT are stable and which have been modified, whether the version they got is the latest or a stale one, whether the features they need are included, or whether it bundles too many upgrades they do not need. This also makes reuse painful.
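One way to make this anti-pattern visible is to list the dependencies that are still pinned to SNAPSHOT versions. The Python sketch below assumes the Maven convention that snapshot versions end in `-SNAPSHOT`; the function names and the sample dependency names and versions are hypothetical.

```python
SNAPSHOT_SUFFIX = "-SNAPSHOT"  # Maven's conventional marker for unreleased versions


def is_snapshot(version):
    """True when the version string carries the SNAPSHOT suffix."""
    return version.endswith(SNAPSHOT_SUFFIX)


def snapshot_dependencies(dependencies):
    """Return the names of dependencies that have no released version pinned."""
    return sorted(name for name, version in dependencies.items() if is_snapshot(version))


# Hypothetical dependency list: name -> pinned version.
deps = {
    "common-lib": "1.0-SNAPSHOT",
    "pricing-core": "2.3.1",
    "report-utils": "0.9.0-SNAPSHOT",
}
print(snapshot_dependencies(deps))  # ['common-lib', 'report-utils']
```

A check like this only flags the symptom; whether a flagged dependency should get a real release process is the judgment call discussed above.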

Summary of the REP principle

Combining these two examples with the rest of the discussion leads us to an interesting conclusion: the REP principle is by now a basic requirement of software engineering, and the fact that it is spelled out as a principle may simply reflect the era Uncle Bob comes from.


2018-11-12


  1. A guide to Clean Architecture (2): component cohesion

  2. Mark to market