Refactoring Series: When does code need to be refactored?

Preface

What exactly is the much-maligned practice of refactoring in everyday project development? In the eyes of many developers I’ve met, refactoring means taking old software and spending time redesigning and rebuilding it, only to end up with something whose interface looks exactly the same! In a world where efficiency is king, neither the market nor the business will accept such a waste of people’s time. So this time, let’s seriously discuss what real refactoring is. Before we begin, let’s look at how the experts define it.

Definitions

Refactoring (noun): A change made to the internal structure of software to make it easier to understand and cheaper to modify without changing its observable behavior.

Refactoring (verb): To restructure software by applying a series of refactorings without changing its observable behavior.

In the past, many people used the term refactoring to mean any kind of code cleanup. In fact, the key to refactoring is applying many small, behavior-preserving steps that together add up to a large change. Each individual refactoring is either small itself or composed of several small steps. So if someone says their code was broken for a few days or more during a refactoring, you can be fairly sure that what they were doing was not refactoring.

Why refactoring is needed

Improve software design

Without refactoring, the internal design of a program deteriorates over time. At many domestic technology companies speed is king: the company pushes to shorten development time so that it can launch early, and developers want to get home earlier and work less overtime. As a result, many developers fix bugs and modify code without fully understanding the architectural design, and settle for a temporary patch. One developer changes the code today, another changes it tomorrow, and the code gradually loses its structure. Over time it becomes harder and harder for programmers to recover the original design by reading the code, the code decays faster and faster, and eventually it becomes a project that no one dares to touch.

Improve programming speed

Early in a project, before complexity and code decay have built up, development moves quickly, which gives company leadership the illusion that this is how real development should be and always will be. At a certain point, however, adding a new feature takes much longer than it used to, and developers have to spend far more time working out how to fit the new feature into the existing code base so that a change in one place does not break another. The whole code base starts to look like patches on top of patches, and understanding how the system works feels like archaeology. This burden keeps slowing down new features until, at the last straw, a programmer finally asks: “Why don’t we give up on this project and build a new version from scratch?”

So, how decayed does a project’s code have to be before we should start refactoring? In his book Refactoring, Martin Fowler describes 24 code smells that signal it is time to refactor. Let’s go through them one by one and see whether you’ve made any of these mistakes in your own code.

When to refactor

1. Mysterious naming

A good name tells the reader what the code means at a glance, and clean code starts with good names. In reality, many people are reluctant to rename things in their programs and think it is not worth the time, but a good name saves a lot of puzzling later on. So when a mysterious name appears in a project, it is time to start refactoring. For guidance on meaningful naming, see Clean Code, Chapter 2, “Meaningful Names”.

2. Duplicate code

Good developers follow a rule of thumb: when the same piece of code is used in more than three places, it is time to refactor. Three is not an absolute threshold. Whenever you see the same code structure in more than one place, you should consider refactoring right away. Many novice developers like to develop by copy and paste, which causes unnecessary trouble for whoever has to maintain and modify the code later.
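As a minimal sketch of removing duplication, the example below extracts a check that was pasted into two functions. The function names and the email rule are hypothetical and chosen only for illustration.

```typescript
// Before: the same normalization and check is pasted into two functions.
function registerBefore(email: string): string {
  const normalized = email.trim().toLowerCase();
  if (!normalized.includes("@")) throw new Error("invalid email");
  return `registered ${normalized}`;
}

function subscribeBefore(email: string): string {
  const normalized = email.trim().toLowerCase();
  if (!normalized.includes("@")) throw new Error("invalid email");
  return `subscribed ${normalized}`;
}

// After: the shared logic lives in one place and both callers reuse it.
function normalizeEmail(email: string): string {
  const normalized = email.trim().toLowerCase();
  if (!normalized.includes("@")) throw new Error("invalid email");
  return normalized;
}

function register(email: string): string {
  return `registered ${normalizeEmail(email)}`;
}

function subscribe(email: string): string {
  return `subscribed ${normalizeEmail(email)}`;
}
```

If the validation rule changes later, only normalizeEmail has to be edited.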

3. Overlong functions

I have been told that whenever a function grows beyond 50 lines, you should start thinking about refactoring it. In my own experience, small, fine-grained functions live the longest and fare the best, because they follow the single responsibility principle. When a function is too long it becomes complex and hard to understand, so many developers add more comments to the long function to explain it. This is a mistake: piling on comments does not make the code any easier to read or understand. When we feel the need for a comment to explain a piece of logic, we should instead extract that logic into a separate function with a clear, intuitive name.
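The idea of turning explanatory comments into named functions can be sketched as follows. The Order shape, the shipping rule, and all function names are assumptions made up for this example.

```typescript
interface Order {
  items: { price: number; quantity: number }[];
  couponRate: number; // for example, 0.5 means 50% off
}

// Before: one function whose steps are explained by comments.
function checkoutBefore(order: Order): number {
  // compute the subtotal
  let subtotal = 0;
  for (const item of order.items) {
    subtotal += item.price * item.quantity;
  }
  // apply the coupon
  const discounted = subtotal * (1 - order.couponRate);
  // add a flat shipping fee unless the order is large enough
  return discounted >= 100 ? discounted : discounted + 10;
}

// After: each commented step becomes a small function whose name replaces the comment.
function subtotalOf(order: Order): number {
  return order.items.reduce((sum, item) => sum + item.price * item.quantity, 0);
}

function applyCoupon(amount: number, rate: number): number {
  return amount * (1 - rate);
}

function addShipping(amount: number): number {
  return amount >= 100 ? amount : amount + 10;
}

function checkoutAfter(order: Order): number {
  return addShipping(applyCoupon(subtotalOf(order), order.couponRate));
}
```

Both versions compute the same result; the second one reads as a list of steps without needing any comments.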

4. The parameter list is too long

When I was writing AngularJS (Angular 1) code, I often had to inject many dependencies into a controller and pass many arguments to callback functions, and a slip of the hand that put them in the wrong order caused quite a few bugs. Long parameter lists are confusing and error-prone. We can shorten them by merging multiple arguments into a single object and passing that object instead, which makes the code simpler to read.
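Here is a small sketch of merging arguments into one object. The NewUser interface and the createUser functions are hypothetical names used only to show the shape of the change.

```typescript
// Before: five positional arguments, three of them numbers that are easy to swap.
function createUserBefore(
  name: string,
  age: number,
  level: number,
  score: number,
  active: boolean
): string {
  return `${name}:${age}:${level}:${score}:${active}`;
}

// After: one parameter object. Each value is labeled at the call site,
// so the order in which the caller writes them no longer matters.
interface NewUser {
  name: string;
  age: number;
  level: number;
  score: number;
  active: boolean;
}

function createUser(user: NewUser): string {
  const { name, age, level, score, active } = user;
  return `${name}:${age}:${level}:${score}:${active}`;
}
```

A call such as createUser({ score: 90, name: "Ann", active: true, level: 3, age: 30 }) cannot silently mix up age, level, and score the way a positional call can.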

5. Global data

Using global data in a project easily leads to data pollution and conflicts, and many coding standards explicitly prohibit it. Global data can be modified from anywhere in the code base, which makes bugs exceptionally difficult to track down. In practice, global data can be encapsulated behind a module and explicitly imported wherever it is used.
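One possible way to encapsulate global data is sketched below. The Settings shape and the function names are assumptions for illustration. In a real project the two functions would be exported from their own module and imported where needed, while the variable itself would stay unexported.

```typescript
// Hypothetical settings module.
interface Settings {
  readonly apiBase: string;
  readonly retries: number;
}

// The single copy of the data. When this lives in its own module and is not
// exported, other modules can only reach it through the functions below.
let currentSettings: Settings = { apiBase: "/api", retries: 3 };

// Readers receive a copy, so they cannot change the shared state by accident.
function getSettings(): Settings {
  return { ...currentSettings };
}

// Every write goes through one validated entry point.
function setRetries(retries: number): void {
  if (!Number.isInteger(retries) || retries < 0) {
    throw new RangeError("retries must be a non-negative integer");
  }
  currentSettings = { ...currentSettings, retries };
}
```

Because every write passes through setRetries, a breakpoint or log statement in that one function shows who changes the value.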

6. Mutable data

JavaScript is a weakly typed language, so problems with mutable data arise easily: data is updated in one place while another part of the software, unaware of the change, expects data of a completely different type, and the result is a bug. If this only happens under rare conditions, the cause of the failure is even harder to find. This is one reason TypeScript is becoming more and more popular in the front-end world.

7. Divergent change

When we start developing software, we want it to be easy to change: when a change is needed, we should be able to jump to a single point in the system and make the change only there. Divergent change occurs when a single module is frequently changed in different ways for different reasons.

8. Shotgun surgery

Shotgun surgery is similar to divergent change, but the opposite. If every change forces you to make many small edits in many different classes, you are facing shotgun surgery: the code you need to change is scattered across multiple places, so it is hard to find all of them and easy to miss an important one. This is when you need to encapsulate and abstract, or move the related code into the same module so that changes are centralized.

9. Feature envy

Developers with object-oriented experience tend to write code with principles such as high cohesion, low coupling, and the open/closed principle in mind. Sometimes, however, you will find that a function communicates more with functions or data in another module than with those in its own module. This is feature envy. We can fix it by moving that code into the module it accesses, or by splitting the function into smaller functions and placing each one where it belongs.
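Moving a function to the data it uses might look like the sketch below. Account, PaymentService, and the overdraft rule are hypothetical and only serve to illustrate the move.

```typescript
class Account {
  private balance: number;
  private overdraftLimit: number;

  constructor(balance: number, overdraftLimit: number) {
    this.balance = balance;
    this.overdraftLimit = overdraftLimit;
  }

  // After the move, the rule sits next to the data it reads.
  canWithdraw(amount: number): boolean {
    return amount > 0 && this.balance + this.overdraftLimit >= amount;
  }
}

class PaymentService {
  // Before the move, this method had to read the account's balance and
  // overdraft limit itself, which forced Account to expose both fields.
  pay(account: Account, amount: number): string {
    return account.canWithdraw(amount) ? "paid" : "declined";
  }
}
```

With the rule inside Account, both fields can stay private and PaymentService only asks a question.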

10. Data clumps

When maintaining other people’s code, I have often seen the same strings and arguments scattered across many places, which makes the code difficult to maintain. What we need to do is gather the scattered data into a separate object and access it by importing that object. This removes a lot of duplication and keeps changes in one place.

11. Primitive obsession

Primitive obsession is common in projects whose code has decayed: prices that could be represented as floating-point numbers and coordinate ranges that could be encapsulated as objects are instead stored and displayed as strings. Strings are treated as an all-purpose data type. The fix is to replace such code with values of the correct type.
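As a sketch of replacing a string with a proper type, the hypothetical Money class below wraps a price. Storing integer cents is a design choice made for this example, not something the text prescribes.

```typescript
class Money {
  readonly cents: number;
  readonly currency: string;

  constructor(cents: number, currency: string) {
    if (!Number.isInteger(cents)) throw new RangeError("cents must be an integer");
    this.cents = cents;
    this.currency = currency;
  }

  // Converts a decimal string such as "12.50" once, at the edge of the system.
  static parse(text: string, currency: string): Money {
    const value = Number(text);
    if (text.trim() === "" || !Number.isFinite(value)) {
      throw new RangeError(`not a number: ${text}`);
    }
    return new Money(Math.round(value * 100), currency);
  }

  add(other: Money): Money {
    if (other.currency !== this.currency) throw new Error("currency mismatch");
    return new Money(this.cents + other.cents, this.currency);
  }

  toString(): string {
    return `${(this.cents / 100).toFixed(2)} ${this.currency}`;
  }
}

// With plain strings, "+" concatenates instead of adding.
const asStrings = "12.50" + "2.50";

// With the Money type, the same operation is arithmetic.
const asMoney = Money.parse("12.50", "USD").add(Money.parse("2.50", "USD"));
```

Here asStrings is the text "12.502.50", while asMoney.toString() yields "15.00 USD".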

12. Repeated switches

Large numbers of repeated nested ifs and repeated switch statements are a common way for novices to handle conditions and branching. The problem with repeated switches is that when you want to add a branch, you have to find every switch and modify them one by one. For repeated branching logic, we can replace the conditional expressions with polymorphism, which makes the code more elegant.
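Replacing repeated switches with polymorphism can be sketched as follows. The shape types and function names are hypothetical examples.

```typescript
// Before: every function that depends on the kind repeats the same switch.
type ShapeData =
  | { kind: "circle"; radius: number }
  | { kind: "rect"; width: number; height: number };

function areaBefore(shape: ShapeData): number {
  switch (shape.kind) {
    case "circle":
      return Math.PI * shape.radius * shape.radius;
    case "rect":
      return shape.width * shape.height;
  }
}

function labelBefore(shape: ShapeData): string {
  switch (shape.kind) {
    case "circle":
      return "circle";
    case "rect":
      return "rect";
  }
}

// After: each kind is a class that carries its own branch of the logic.
interface Shape {
  area(): number;
  label(): string;
}

class Circle implements Shape {
  readonly radius: number;
  constructor(radius: number) {
    this.radius = radius;
  }
  area(): number {
    return Math.PI * this.radius * this.radius;
  }
  label(): string {
    return "circle";
  }
}

class Rect implements Shape {
  readonly width: number;
  readonly height: number;
  constructor(width: number, height: number) {
    this.width = width;
    this.height = height;
  }
  area(): number {
    return this.width * this.height;
  }
  label(): string {
    return "rect";
  }
}

// Callers no longer branch on the kind at all.
function describe(shape: Shape): string {
  return `${shape.label()} with area ${shape.area().toFixed(2)}`;
}
```

Adding a triangle now means adding one class instead of hunting down every switch.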

13. Loop statements

In most business scenarios explicit loop statements are not necessary, yet many newcomers write a lot of for loops to meet their requirements. We can often replace loops with pipeline operations such as filter, map, and forEach, which make it quicker to see which elements are processed and what is done to them.
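The difference between a loop and a pipeline is easiest to see side by side. The Employee shape and the selection rule below are made up for this sketch.

```typescript
interface Employee {
  name: string;
  department: string;
  salary: number;
}

// Before: a loop that mixes selection and transformation in one body.
function seniorEngineersLoop(employees: Employee[]): string[] {
  const result: string[] = [];
  for (let i = 0; i < employees.length; i++) {
    const e = employees[i];
    if (e.department === "engineering" && e.salary > 5000) {
      result.push(e.name.toUpperCase());
    }
  }
  return result;
}

// After: each pipeline stage states one step of the processing.
function seniorEngineersPipeline(employees: Employee[]): string[] {
  return employees
    .filter((e) => e.department === "engineering")
    .filter((e) => e.salary > 5000)
    .map((e) => e.name.toUpperCase());
}
```

Both functions return the same array; the pipeline version makes the selection criteria and the final transformation visible as separate steps.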

14. Redundant elements

In project development, we often abstract and encapsulate code to support change and promote reuse, but it is hard to judge the right degree of abstraction, and the result is often over-encapsulation. A function’s name may say exactly the same thing as its implementation, a class may amount to nothing more than a simple function, and so on, which leaves redundant elements in the code. We can remove them by inlining the overly abstract code back into its callers.

15. Speculative generality

Similarly, over-designed code contains all kinds of special cases and hooks meant to handle very rare or unnecessary situations. These special cases only make the system harder to understand and maintain. So at design time, think carefully about whether the extra generality is really worth it. If it is never used, it just makes the code smell.

16. Temporary fields

In a multi-person project, you may have seen someone define a field that is only used in certain situations. Outside the branch of code that uses it, such a temporary field can be very misleading. We can create a dedicated class for these temporary fields, move the fields and their associated code into it, and call that class’s methods where necessary.

17. Excessively long message chains

When reading someone else’s code to see how it gets its data, you may find that the object you ask delegates to another object, which in turn asks yet another object, forming a long message chain. If you are under pressure to fix a bug, code like this can drive you mad. When a message chain is too long, we can cut it short by moving the navigation into a method on a class, which hides the chain and makes the code much more readable.
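One way to shorten a message chain is to hide it behind a method, as in this sketch. The Manager, Department, and Employee classes are hypothetical.

```typescript
class Manager {
  readonly name: string;
  constructor(name: string) {
    this.name = name;
  }
}

class Department {
  readonly manager: Manager;
  constructor(manager: Manager) {
    this.manager = manager;
  }
}

class Employee {
  private department: Department;
  constructor(department: Department) {
    this.department = department;
  }

  // Before, callers walked the chain themselves with an expression like
  // employee.department.manager.name and so depended on every link in it.
  // After, the chain is hidden here and callers ask one question.
  managerName(): string {
    return this.department.manager.name;
  }
}

const employee = new Employee(new Department(new Manager("Grace")));
const boss = employee.managerName();
```

If the way a manager is looked up changes later, only managerName has to be updated. Taken too far, this produces the middle man smell described next, so it is a trade-off rather than a rule.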

18. Middle man

One of the hallmarks of object-oriented programming is encapsulation, and encapsulation often comes with delegation. But if you see a class that delegates half of its interface to other classes, encapsulation is being overused. At this point we should remove the middle man and interact directly with the object that does the real work.

19. Insider trading

In practice, many people strive for high cohesion within modules, but this inevitably increases the amount of data exchanged between modules, which increases coupling. If two modules are constantly exchanging data in private, we need to find the data they share, extract it into a separate intermediary between the two modules, and bring the exchange out into the open.

20. Oversized classes

The biggest contributor to bloated code is the oversized class. Such a class tries to do too much: it has too many fields, accumulates duplicate code, becomes a mess, and eventually becomes unmaintainable. For this kind of class, we need to extract code: move each area of responsibility into its own class and move shared content into a superclass.

21. Similar classes

After several iterations of a code base, a lot of similar classes tend to appear. They all seem to do the same thing with only minor differences, such as different parameters, and they are filled with duplicate code. We can change the function declarations so that the parameters are consistent and then merge the classes. If duplicate code remains, we can extract it into a superclass.

22. Data classes

For ease of management, some experienced programmers put bits and pieces of data into a dedicated data class that has nothing but a set of fields and the functions that read and write them. Such a class does nothing beyond giving access to its fields, so we should move the behavior that processes this data from other places into the data class.
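Moving behavior into a data class could look like the following sketch. The cart example and all of its names are assumptions chosen for illustration.

```typescript
interface CartLine {
  price: number;
  quantity: number;
}

// Before: the class only holds data, and the logic lives somewhere else.
class CartBefore {
  lines: CartLine[] = [];
}

function cartTotalBefore(cart: CartBefore): number {
  let total = 0;
  for (const line of cart.lines) {
    total += line.price * line.quantity;
  }
  return total;
}

// After: the class owns both the data and the behavior that processes it.
class Cart {
  private lines: CartLine[] = [];

  add(price: number, quantity: number): void {
    if (price < 0 || quantity <= 0) {
      throw new RangeError("price must be >= 0 and quantity must be > 0");
    }
    this.lines.push({ price, quantity });
  }

  total(): number {
    return this.lines.reduce((sum, line) => sum + line.price * line.quantity, 0);
  }
}
```

Because lines is now private, invalid entries can only be rejected in one place, the add method.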

23. Refused bequest

When maintaining code that uses inheritance, you will often find a subclass that inherits all of the parent class’s functions and data but uses only one value from it and rejects the rest. This means the inheritance hierarchy is wrong. We need to create a new sibling class for this subclass and push the functions and values it does not need down into that sibling, so that the parent class holds only what all subclasses share.

24. Comments

Good code explains itself. In reality, most comments exist because the code is so bad that it needs a comment to make up for it. So when you feel the need to write a comment, try refactoring first: extract functions and change function declarations, and try to make the comment redundant.

That’s it for the theory behind the 24 code smells. In the next article, we’ll start putting it into practice in earnest!