Preface

Recently we have been following the electric vehicle industry, where most manufacturers propose moving from platform-based design to modular design. The point of modularization is to maximize design reuse: meeting more personalized needs more quickly with the fewest modules and parts. Modularization covers not only design but also process, manufacturing, delivery and service. For an enterprise, the benefits are fewer parts, fewer variant designs, the capacity to take more orders, and stronger core competitiveness.

Modularity is just as common in software engineering. Many tools for analyzing software architecture (metrics, fitness functions, visualizations) depend on the concept of modularity. Modularity is an organizing principle: if you pay no attention to how the parts of a system connect when you design it, the system you build will eventually run into many problems. The importance of “high cohesion, low coupling” has been emphasized repeatedly in books like The Code Book and The Code Way.

1. Definition of modularity

A module is one of a set of standardized parts or independent units that can be used to build a more complex structure. In software, modularity describes a logical grouping of related code: a set of classes in an object-oriented language, or a set of functions in a structured or functional language. Developers typically use modules as a way to group related code together.

Most languages provide a mechanism for modularity, such as packages in Java. A package such as com.xxx.customer should contain customer-specific content. Understanding how packages are planned has important implications for architectural design: if several packages are tightly coupled, it becomes harder to reuse any one of them elsewhere.

When discussing architecture, modularity is used as a general term for a related grouping of code: classes, functions, or any other grouping. This does not imply physical separation, only logical separation, and the difference sometimes matters. For example, lumping a large number of classes together in a monolithic application is convenient, but when the architecture has to be restructured, the coupling encouraged by such loose partitioning becomes an obstacle to splitting the monolith. It is therefore useful to discuss modularity as a concept separate from the physical separation that a particular platform enforces or implies.

2. Modular metrics

Modularity is usually analyzed and measured along three dimensions: cohesion, coupling and symbiosis (usually called connascence in the English-language literature).

2.1 Cohesion

Cohesion refers to how strongly the parts of a module belong together; it measures how related those parts are to one another. Ideally, a cohesive module packages together everything that belongs together, because splitting it into smaller pieces would force those pieces to be coupled through calls between modules. Attempts to split a cohesive module therefore increase coupling and reduce readability. The common kinds of cohesion are listed below.

2.1.1 Classification of cohesion

Functional cohesion: Every part of the module is related to the others, and the module contains everything it needs to do its job. For example, Math.max() is a routine that performs exactly the operation its name describes. If it did anything else, it would not be cohesive enough.

public class Math {
    // Does exactly one thing: returns the larger of two values.
    public static int max(int a, int b) {
        return (a >= b) ? a : b;
    }
}

Sequential cohesion: Two modules interact so that the output of one becomes the input of the other. For example, user information is fetched by ID and then used to fetch the employee information. If the two steps were separated, the caller would have to call both and combine them itself, which makes the calling code harder to read.

public Employ getEmployInfo(String userId) {
    // Step 1: look up the user by ID.
    User user = userService.getUserById(userId);
    if (user == null || StringUtils.isEmpty(user.getWorkNo())) {
        return null;
    }
    // Step 2: the work number from step 1 is the input of the employee lookup.
    return employService.getEmployByWorkNo(user.getWorkNo());
}

Communicational cohesion (also called informational cohesion): Two modules form a communication chain in which each operates on the same information or contributes to some output. For example, one step writes a record to the database and the next generates an e-mail based on that information.

public void handleUpdate(User user) {
    // Persist the change, then reuse the same record to drive the notification.
    User updated = userService.updateById(user);
    if (updated == null || StringUtils.isEmpty(updated.getEmail())) {
        return;
    }
    mailService.sendEmail(updated.getEmail(), "xxxx");
}

Temporal cohesion: The parts of a module are related by timing. For example, many systems have a list of seemingly unrelated things that must all be initialized at system startup. The startup process of the Spring framework is an example: the bean container (bean factory) is initialized first, and the other modules are initialized after it.

@Override
public void refresh() throws BeansException, IllegalStateException {
    prepareRefresh();
    ConfigurableListableBeanFactory beanFactory = obtainFreshBeanFactory();
    prepareBeanFactory(beanFactory);
    postProcessBeanFactory(beanFactory);
    invokeBeanFactoryPostProcessors(beanFactory);
    registerBeanPostProcessors(beanFactory);
    initMessageSource();
    initApplicationEventMulticaster();
    onRefresh();
    registerListeners();
    finishBeanFactoryInitialization(beanFactory);
    finishRefresh();
}

Logical cohesion: The data or operations within a module are logically similar but functionally unrelated. A common case is message handling, where different processing logic is applied according to the type of the message content.

void handleMessage(MessageBody body) {
    // Dispatch on the message type; the branches are functionally unrelated.
    switch (body.getType()) {
    case TYPE1:
        handle1(body);
        break;
    case TYPE2:
        handle2(body);
        break;
    default:
        ...
    }
}

Temporary cohesion: A routine contains operations that are grouped together only because they are executed at the same time. An initialization function is a typical example.

public void init() {
    // There is no connection between the following operations,
    // but all of them must run when the program starts.
    initIdlist();
    initErrorList();
    // Call a subroutine for each step instead of writing tedious initialization code here.
    ...
}

Accidental (coincidental) cohesion: The elements of a module are unrelated to each other except that they sit in the same source file. This is the worst form of cohesion and should be avoided as much as possible.

How cohesive a particular module should be has to be judged case by case. Take a customer maintenance module that also contains order operations: it could be split into two separate modules, a customer module and an order module. Which design is right is open to debate, because each choice affects development and implementation cost:

  • Does the order module have only two operations? If so, they can stay in the customer maintenance module and there is no need to split.

  • Is the customer module expected to grow, encouraging developers to extract more business behavior from it?

  • Does the order module need so much customer information that the two modules would still be tightly coupled after the split?

2.1.2 Cohesion metrics

A structural measure of cohesion has been developed in computer science. The Chidamber and Kemerer (C&K) Lack of Cohesion in Methods (LCOM) metric measures the structural cohesion of a module or component. It is defined as follows.

LCOM definition: the sum of sets of methods that are not shared via shared fields. In the figure below, fields are shown as single letters and methods as blocks. Class X has a low LCOM score, which indicates good structural cohesion. Class Y lacks cohesion: each of its field/method pairs could live in its own class without affecting behavior. Class Z shows mixed cohesion: developers could refactor its last field/method combination into a class of its own.

LCOM is useful when you analyze a code base in order to move from one architectural style to another. A common headache when changing an architecture is shared utility classes. LCOM can help you find classes that are only incidentally coupled and should never have been a single class in the first place.
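
The text does not spell out the formula, so the following is only a minimal sketch under an assumption: it uses the pair-counting formulation commonly attributed to Chidamber and Kemerer, LCOM = max(|P| - |Q|, 0), where P is the set of method pairs that share no field and Q is the set of pairs that share at least one. The class name LcomSketch and the map-based input are hypothetical, chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of any library.
final class LcomSketch {

    /**
     * @param fieldsByMethod method name -> names of the fields that method uses
     * @return max(|P| - |Q|, 0): P = method pairs sharing no field,
     *         Q = method pairs sharing at least one field (assumed formulation)
     */
    static int lcom(Map<String, Set<String>> fieldsByMethod) {
        List<Set<String>> usages = new ArrayList<>(fieldsByMethod.values());
        int disjointPairs = 0; // |P|
        int sharingPairs = 0;  // |Q|
        for (int i = 0; i < usages.size(); i++) {
            for (int j = i + 1; j < usages.size(); j++) {
                Set<String> common = new HashSet<>(usages.get(i));
                common.retainAll(usages.get(j));
                if (common.isEmpty()) {
                    disjointPairs++;
                } else {
                    sharingPairs++;
                }
            }
        }
        return Math.max(disjointPairs - sharingPairs, 0);
    }

    public static void main(String[] args) {
        // Similar to class X: every method touches the shared field "a".
        Map<String, Set<String>> cohesive = Map.of(
                "m1", Set.of("a"),
                "m2", Set.of("a", "b"),
                "m3", Set.of("a"));
        // Similar to class Y: each method only uses its own field.
        Map<String, Set<String>> scattered = Map.of(
                "m1", Set.of("a"),
                "m2", Set.of("b"),
                "m3", Set.of("c"));
        System.out.println("cohesive:  " + lcom(cohesive));
        System.out.println("scattered: " + lcom(scattered));
    }
}
```

A score of zero suggests the methods work on shared state, while a higher score hints that the class could be split along the groups of methods that never touch the same fields.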

2.2 Coupling

Coupling is mainly measured as afferent and efferent coupling. Afferent coupling measures the number of incoming connections to a code artifact (component, class, function and so on), and efferent coupling measures its outgoing connections to other code artifacts. Both are useful for reorganizing, migrating and understanding code bases.
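
As a rough illustration of the two counts, the sketch below derives them from a dependency map. The map format, the class name CouplingCounter and the service names are all hypothetical; real tools extract these dependencies from source or bytecode.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical helper: counts couplings from a "who depends on whom" map.
final class CouplingCounter {

    // Efferent coupling: how many artifacts `name` depends on (outgoing).
    static int efferent(Map<String, Set<String>> dependencies, String name) {
        return dependencies.getOrDefault(name, Set.of()).size();
    }

    // Afferent coupling: how many other artifacts depend on `name` (incoming).
    static int afferent(Map<String, Set<String>> dependencies, String name) {
        int count = 0;
        for (Map.Entry<String, Set<String>> entry : dependencies.entrySet()) {
            if (!entry.getKey().equals(name) && entry.getValue().contains(name)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Key = artifact, value = the artifacts it uses (illustrative names).
        Map<String, Set<String>> dependencies = Map.of(
                "OrderService", Set.of("CustomerRepo", "MailService"),
                "ReportService", Set.of("CustomerRepo"));
        System.out.println("Ca(CustomerRepo) = " + afferent(dependencies, "CustomerRepo"));
        System.out.println("Ce(OrderService) = " + efferent(dependencies, "OrderService"));
    }
}
```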

2.3 Abstractness, instability and distance from the main sequence

Abstractness is the ratio of abstract artifacts (abstract classes, interfaces and so on) to concrete artifacts (implementations). It measures the balance between abstraction and implementation. At one extreme is a code base with no abstractions at all, just one huge function (everything implemented in a single main() method); at the other is a code base with so many abstractions that developers struggle to understand how things fit together. Abstractness can be expressed by the following equation:

$$A = \frac{\sum m^a}{\sum m^c}$$

In this equation, m^a represents the abstract elements (interfaces or abstract classes) in the module and m^c represents the concrete elements (non-abstract classes). For example, for an application with 10,000 lines of code, all of them in a main() method, the numerator is 1 and the denominator is 10,000, yielding an abstractness of almost zero.

Another derived metric is instability, defined as the ratio of efferent (outgoing) coupling to the sum of efferent and afferent (incoming) coupling:

$$I = \frac{C^e}{C^e + C^a}$$

Here C^e is the efferent coupling and C^a is the afferent coupling. The instability metric represents the volatility of the code: because of its high coupling, highly unstable code is more likely to break when the code base changes. For example, if a class calls many other classes to delegate work, the calling class is vulnerable to breakage when one or more of the called methods change.

The distance from the main sequence can be expressed as follows:

$$D = |A + I - 1|$$

In this equation, A represents abstractness and I represents instability. Both are ratios, so D is always between 0 and 1. The relationship can be drawn in two-dimensional coordinates, where the main sequence is the idealized line relating abstractness and instability. A class close to that line has a healthy balance between these two competing concerns. Plotting the abstractness and instability of a class lets the developer read off its distance from the main sequence, as shown below.

Mark a class as a point in the figure below and measure the distance of that point from the ideal line. The closer the point is to the line, the better balanced the class. Classes that fall into the upper right corner are too abstract to be usable, while classes that fall into the lower left corner have too much implementation and too little abstraction, which makes them brittle and hard to maintain.

These equations are very useful when migrating code or assessing technical debt, because they replace subjective impressions with quantitative metrics.
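
The three metrics can be tied together in a few lines. This is only a sketch that follows the definitions in the text (abstract elements divided by concrete elements, efferent coupling divided by total coupling, and the absolute value of A + I - 1). The class name MainSequenceMetrics, the sample numbers, and the choice to report zero instability for an artifact with no couplings are assumptions.

```java
// Hypothetical helper that mirrors the three ratios described in the text.
final class MainSequenceMetrics {

    // A: abstract elements divided by concrete elements, as defined in the text.
    static double abstractness(int abstractElements, int concreteElements) {
        if (concreteElements <= 0) {
            throw new IllegalArgumentException("at least one concrete element is required");
        }
        return (double) abstractElements / concreteElements;
    }

    // I: efferent coupling divided by the sum of efferent and afferent coupling.
    // Assumption: an artifact with no couplings at all is reported as 0.0.
    static double instability(int efferent, int afferent) {
        int total = efferent + afferent;
        return total == 0 ? 0.0 : (double) efferent / total;
    }

    // D: distance from the main sequence, |A + I - 1|.
    static double distance(double abstractness, double instability) {
        return Math.abs(abstractness + instability - 1.0);
    }

    public static void main(String[] args) {
        double a = abstractness(1, 4); // illustrative counts
        double i = instability(3, 1);  // illustrative couplings
        System.out.println("A = " + a + ", I = " + i + ", D = " + distance(a, i));
        // Fully concrete and fully stable code sits far from the ideal line.
        System.out.println("D(0, 0) = " + distance(0.0, 0.0));
    }
}
```

A value of D near 0 means the class sits close to the ideal line; a value near 1 places it in one of the two corners described above.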

2.4 Symbiosis (connascence)

Definition: if a change in one component requires another component to be modified in order to maintain the overall correctness of the system, the two components are symbiotic. Symbiosis can be classified as follows:

  • Static symbiosis: coupling at the source-code level
  • Dynamic symbiosis: coupling between components at execution time, such as a required execution order

2.4.1 Static symbiosis

  • Name symbiosis (CoN): Multiple components must agree on the name of an entity. Method names are the most common way code is coupled and also the easiest to change, especially with modern refactoring tools that make system-wide renames trivial.

  • Type symbiosis (CoT): Multiple components must agree on the type of an entity. This refers to the practice, common in many statically typed languages, of restricting variables and parameters to specific types. It is not exclusive to statically typed languages, however; some dynamically typed languages also offer optional typing.

  • Meaning symbiosis or convention symbiosis (CoM): Multiple components must agree on the meaning of particular values. The most common and obvious example in a code base is hard-coded numbers used instead of constants. In some languages, for example, it is common to define int TRUE = 1; int FALSE = 0.

  • Position symbiosis (CoP): Multiple components must agree on the order of values. This is an issue for the arguments of method and function calls, even in statically typed languages. If you create a method updateUser(String uid, String name) and call it as updateUser("test", "123"), the semantics are wrong even though the types are correct.

  • Algorithm symbiosis (CoA): Multiple components must agree on a particular algorithm. A common case is a secure hash algorithm that must run on both the server and the client and produce identical results in order to authenticate a user. This is obviously a strong form of coupling: if either side changes any detail of the algorithm, the handshake no longer works.
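
The algorithm case can be illustrated with a small sketch. It assumes a hypothetical class SharedTokenHash and uses the JDK's java.security.MessageDigest with SHA-256. It only demonstrates that both sides must use the same algorithm; it is not meant as a production authentication scheme.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical example: client and server are coupled through one hash algorithm.
final class SharedTokenHash {

    // The algorithm both sides have agreed on.
    static final String AGREED_ALGORITHM = "SHA-256";

    static byte[] digest(String algorithm, String secret) {
        try {
            return MessageDigest.getInstance(algorithm)
                    .digest(secret.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("Algorithm not available: " + algorithm, e);
        }
    }

    // Server side: recompute the digest with the agreed algorithm and compare.
    static boolean serverAccepts(byte[] clientDigest, String storedSecret) {
        return MessageDigest.isEqual(clientDigest, digest(AGREED_ALGORITHM, storedSecret));
    }

    public static void main(String[] args) {
        String secret = "s3cret"; // illustrative value
        // A client using the agreed algorithm is accepted.
        System.out.println(serverAccepts(digest(AGREED_ALGORITHM, secret), secret));
        // A client that silently switched algorithms breaks the handshake.
        System.out.println(serverAccepts(digest("SHA-1", secret), secret));
    }
}
```

Keeping the algorithm name in one shared constant turns part of this coupling into name symbiosis, which is easier to find and refactor.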

2.4.2 Dynamic symbiosis

  • Execution symbiosis (CoE): The order in which multiple components execute matters. The code below produces an incorrect result because the properties must be set before the e-mail is sent, yet the subject is set after send().

    email = new Email();
    email.setRecipient("");
    email.setSender("");
    email.send();
    email.setSubject("test");

  • Timing symbiosis (CoT): The timing of the execution of multiple components matters. It is common in multithreaded code, where two threads executing at the same time cause a race condition that affects the outcome of their joint operation.

  • Value symbiosis (CoV): Several values depend on one another and must change together. This is especially problematic in distributed systems, where a single transaction may have to update a value across all databases.

  • Identity symbiosis (CoI): Multiple components must reference the same entity. A common example is two independent components that must share and update the same data structure, such as a distributed queue.
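
One way to make execution symbiosis visible, sketched below under assumptions, is to have the send step check that every required property has already been set, so a wrong call order fails immediately instead of silently sending an incomplete message. The EmailDraft class and its methods are hypothetical and do not correspond to any real mail API.

```java
// Hypothetical sketch: send() refuses to run until all required fields are set.
final class EmailDraft {
    private String recipient;
    private String sender;
    private String subject;

    EmailDraft recipient(String value) { this.recipient = value; return this; }
    EmailDraft sender(String value) { this.sender = value; return this; }
    EmailDraft subject(String value) { this.subject = value; return this; }

    // Returns a textual rendering instead of really sending, to stay self-contained.
    String send() {
        if (recipient == null || sender == null || subject == null) {
            throw new IllegalStateException("recipient, sender and subject must be set before send()");
        }
        return "to=" + recipient + ", from=" + sender + ", subject=" + subject;
    }

    public static void main(String[] args) {
        // Correct order: everything is configured before send().
        System.out.println(new EmailDraft()
                .recipient("a@example.com")
                .sender("b@example.com")
                .subject("test")
                .send());
        // Wrong order: calling send() before subject() fails fast.
        try {
            new EmailDraft().recipient("a@example.com").sender("b@example.com").send();
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The ordering requirement still exists, but it is now enforced in one place rather than being an unwritten rule that every caller has to remember.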

2.4.3 Properties of symbiosis

  • Strength: The strength of a symbiosis is determined by how easily that kind of coupling can be refactored. Prefer static symbiosis to dynamic symbiosis, because static symbiosis can be identified through simple source code analysis and modern tools make it easy to improve. For example, meaning symbiosis can be improved by refactoring it into name symbiosis: replace the magic value with a named constant.

  • Locality: Locality measures how close the modules are to each other in the code base. Code that sits close together (in the same module) usually has more, and stronger, forms of symbiosis than code that is further apart (in separate modules or code bases). In other words, a form of symbiosis that signals poor coupling when the parties are far apart is acceptable when they are close together. For example, meaning symbiosis between two classes in the same component is less damaging to the code base than the same symbiosis between two separate components.

  • Degree: The degree of symbiosis depends on the size of its impact: does it affect a few classes or many? A smaller degree of symbiosis does less damage to the code base.
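
The refactoring mentioned under Strength, from meaning symbiosis to name symbiosis, could look roughly like this. The AccountStatus enum, its numeric codes and the demo class are hypothetical and serve only as an illustration.

```java
// Hypothetical status type: the numeric convention is decoded in one place only.
enum AccountStatus {
    INACTIVE(0),
    ACTIVE(1);

    private final int code;

    AccountStatus(int code) {
        this.code = code;
    }

    int code() {
        return code;
    }

    static AccountStatus fromCode(int code) {
        for (AccountStatus status : values()) {
            if (status.code == code) {
                return status;
            }
        }
        throw new IllegalArgumentException("Unknown status code: " + code);
    }
}

final class AccountStatusDemo {

    // Before: meaning symbiosis, every caller must know that 1 means "active".
    static boolean isActiveMagic(int rawStatus) {
        return rawStatus == 1;
    }

    // After: name symbiosis, callers only agree on the name AccountStatus.ACTIVE.
    static boolean isActive(int rawStatus) {
        return AccountStatus.fromCode(rawStatus) == AccountStatus.ACTIVE;
    }

    public static void main(String[] args) {
        System.out.println(isActiveMagic(1) + " " + isActive(1));
        System.out.println(isActiveMagic(0) + " " + isActive(0));
    }
}
```

If the numeric encoding ever changes, only the enum has to be updated, and a refactoring tool can find every use of the name.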

Suggestions for improving system modularity and reducing symbiosis:

  • Decompose the system into encapsulated elements to minimize overall symbiosis

  • Minimize any remaining symbiosis that crosses encapsulation boundaries

  • Maximize symbiosis within encapsulation boundaries

Coupling and symbiosis are measures from different eras with different goals, and the two concepts overlap. Structured programming cares about what goes in and what comes out, whereas symbiosis cares about how things are coupled together. What structured programming calls data coupling (method invocation) corresponds to static symbiosis, and symbiosis adds guidance on the form that coupling should take.

3. Summary

Starting from the definition of modularity, this article described the three dimensions of cohesion, coupling and symbiosis: their definitions, typical scenarios, and how each can be analyzed and measured. Where possible it used formulas or mathematical methods to quantify the three dimensions, which gives decisions a more objective basis and offers useful guidance for improving the modularity and code quality of our systems.

Reference: Software Architecture-A Guide to Architectural Patterns, Characteristics, and Practices