This article summarizes my own development experience together with Martin Fowler’s book Refactoring: Improving the Design of Existing Code, in the hope of helping more developers see that refactoring is not the heavy work they imagine and can in fact be simple: commit a feature, review, and refactor.
1. What is refactoring
Refactoring is defined as improving the design of existing code.
Specifically, it means adjusting the internal structure of the code without changing its observable behavior. It is important to note that refactoring is not performance optimization. Refactoring aims to make code easier to understand and extend, and it may affect performance either positively or negatively. Performance optimization makes the program run faster, but the resulting code may be harder to understand and maintain.
2. Why refactor
2.1. Improve the internal design of the program
Without refactoring, the design of the code will only become more corrupt as the software continues to iterate, making development difficult.
There are two main reasons for this:
- People often modify code for short-term goals without fully understanding the overall architecture (in large projects, for example, the same statements end up doing the same thing in several different places), so the code gradually loses its structure. This loss is cumulative: the harder it is to see the design intent in the code, the harder it is to protect that design.
- It is almost impossible to produce a perfect design up front for features that are not yet known; only practice reveals what the design should be.
So refactoring is essential if you want to keep delivering features quickly and well.
2.2. Make the code easier to understand
In development, we need to understand what code is doing before we can change it, and we often forget how our own code works, let alone the code of others. Maybe the code contains some convoluted conditional logic, or maybe the variable names are poor and the code is poorly commented, so that it takes a while to figure out what is going on.
Proper refactoring makes code “self-explanatory” and easy to understand, which helps both when collaborating and when maintaining features you implemented earlier, and the benefit shows up immediately in day-to-day development.
2.3. Speed up development && make errors easier to locate
That refactoring speeds up development may sound counterintuitive, because in many cases it looks like extra work that produces no new functions or features. But it reduces the amount of code you have to write (through reusable modules) and makes errors easier to locate (through a finer-grained code structure). Both save a lot of development time and let us “travel light” in later work.
3. Principles of refactoring
3.1. Stay in your current programming mode
Kent Beck came up with the “two hats” metaphor: when developing software, you divide your time between two distinct activities, adding new features and refactoring. When adding new features, you shouldn’t change the existing code, just add the new capability and get the program to work correctly. When refactoring, you don’t add new features, you only adjust the structure of the code, and you change the relevant code only when absolutely necessary.
In the course of development we may swap hats often. While adding a feature, we realize that it would be much easier, or more elegant, to implement if the structure of the program were changed, so we switch back and forth, refactoring and adding features as we go. Doing both at once makes it easy to get confused and lose track of our own code.
We need to know which hat we are wearing at all times and stay focused on that mode. This gives us a clear goal and keeps both the process and our coding progress under control.
3.2. Controllable refactoring
Refactoring doesn’t happen overnight. If it starts to threaten your control over your time or over the feature you are working on, stop and think about whether it is worth it. We must keep the program working and the time spent under control, so take small steps and make sure every step is tracked in Git and covered by tests. Otherwise you can get stuck halfway with a program that no longer works and, even worse, forget what the code looked like before!
This is covered in more detail in the later section on when to start refactoring, so I won’t go into it here.
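One way to keep each step small and verifiable is to pin down the current behavior with a check before touching the structure, and to re-run that check after every step. The following is only a minimal sketch; `formatPrice`, `checkFormatPrice`, and the sample cases are hypothetical and not taken from any real project:

```javascript
// Hypothetical function we intend to refactor.
function formatPrice(amount) {
  return '$' + (Math.round(amount * 100) / 100).toFixed(2)
}

// Characterization checks: they record what the code does today.
// Re-run them after every small refactoring step, and commit when they pass.
function checkFormatPrice(fn) {
  const cases = [
    [0, '$0.00'],
    [1.5, '$1.50'],
    [19.999, '$20.00'],
  ]
  for (const [input, expected] of cases) {
    const actual = fn(input)
    if (actual !== expected) {
      throw new Error(`formatPrice(${input}) returned ${actual}, expected ${expected}`)
    }
  }
  return true
}

checkFormatPrice(formatPrice)
```

If a refactoring step accidentally changes behavior, the check throws right away, and the last Git commit is a known good state to return to.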
4. Identify code smells
Now that the principles of refactoring are clear, below is a checklist of common code smells to serve as a guide. It’s time to go back and look at your code, recognize its smells, and eliminate them!
Of course, if you find this section too long, you can skim it or skip it for now; coming back to review it later is also a good choice.
4.1. Mysterious naming
I’ll admit that in detective novels, trying to guess the plot from cryptic text is a great experience, but in code it often troubles programmers! It takes a lot of time to figure out what a variable or a function does, and you may even need to add a lot of comments to a snippet.
This is not a criticism of comments, but well-structured code with good names is often self-explanatory, which removes the need for many comments and lets the code read as smoothly as prose.
Naming is therefore the first step in any refactoring, but unfortunately it is also one of the hardest things to do in programming.
- You need a balance between brevity and name length.
- You need a uniform naming style, especially across an entire team, because variable naming is often not covered by code style checks!
- Variable names should relate to each other and reveal each other’s meaning. Imagine a snippet that contains cgi and cgiList: you can read the relationship between them directly. With cgi and list, the connection is lost. And if people and human both appear as variables at the same time, doesn’t that confuse you?
- A good command of English is required.
There is no precise, detailed tutorial for variable naming, and uniformity is difficult to enforce. In general, names should follow these three guidelines:
- Be meaningful
- Be related to each other
- Not be reused for a different purpose
Practice is the only test of quality: if your colleagues can understand your variable names, you got them right!
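The three guidelines can be illustrated with a small before/after comparison. This is only a sketch: the `cgi` objects, the `ok` field, and every function name below are made up for illustration.

```javascript
// Before: names carry no meaning, are unrelated, and `d` is reused.
function f(l) {
  let d = l.filter(x => x.ok)
  d = d.length // `d` now means something else entirely
  return d
}

// After: meaningful, related (cgi / cgiList), and each name has one purpose.
function countAvailableCgi(cgiList) {
  const availableCgiList = cgiList.filter(cgi => cgi.ok)
  return availableCgiList.length
}

const cgiList = [{ ok: true }, { ok: false }, { ok: true }]
console.log(f(cgiList), countAvailableCgi(cgiList)) // 2 2
```

Both functions behave identically; only the second one can be read without tracing what each variable currently holds.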
4.2. Duplicate code
Extracting duplicate code is one of the most classic refactoring techniques. We often write similar code in different places, or paste a copy into the current context, with very few differences between the copies.
The trouble is that when the functionality needs to change, you have to find every copy and change it, which makes the code harder to read and modify and invites mistakes. So we should refuse to reinvent the wheel and aim for highly reusable code.
We can extract the duplicated code into a shared function and name it after what it does.
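As a minimal sketch of this extraction, assume two hypothetical rendering functions that carry a copy of the same formatting logic (none of these names come from the original article):

```javascript
// Before: the same formatting logic is copied into two places.
function orderTitleBefore(order) {
  return order.name.trim().toUpperCase() + ' (#' + order.id + ')'
}
function invoiceTitleBefore(invoice) {
  return invoice.name.trim().toUpperCase() + ' (#' + invoice.id + ')'
}

// After: one shared function, named after what it does.
function formatTitle(record) {
  return `${record.name.trim().toUpperCase()} (#${record.id})`
}
const orderTitle = order => formatTitle(order)
const invoiceTitle = invoice => formatTitle(invoice)

const sample = { name: '  desk lamp ', id: 42 }
console.log(orderTitleBefore(sample) === orderTitle(sample)) // true: behavior is unchanged
```

If the title format changes later, only `formatTitle` needs to be edited.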
4.3. Long function
The longer a function is, the harder it is to understand. Long functions also tend to be highly coupled, which makes them difficult to take apart and recombine.
It is generally accepted that a function should fit on one screen, because scrolling up and down increases the probability of mistakes. According to the Tencent code specification, a function should not exceed 80 lines of code.
Look at the following two pieces of code, which do the same thing. You don’t need to understand what they mean (they don’t mean anything in particular); simply compare how they look.
// Before refactoring
function changeList(list) {
  console.log('some operation of list')
  for (let i = 0; i < list.length; i++) {
    // do sth
  }
  console.log('conditional judgment')
  let result
  if (list.length < 4) {
    result = list.pop()
  } else {
    result = list.shift()
  }
  const today = new Date(Date.now())
  const dueDate = new Date(today.getFullYear(), today.getMonth(), today.getDate() + 30)
  result.dueDate = dueDate
  return result
}
// After refactoring
function changeList(list) {
  console.log('some operation of list')
  operationOfList(list)
  console.log('conditional judgment')
  const result = judgment(list)
  result.dueDate = getRecordTime()
  return result
}
function operationOfList(list) {
  for (let i = 0; i < list.length; i++) {
    // do sth
  }
  return list
}
function judgment(list) {
  let result
  if (list.length < 4) {
    result = list.pop()
  } else {
    result = list.shift()
  }
  return result
}
function getRecordTime() {
  const today = new Date(Date.now())
  const dueDate = new Date(today.getFullYear(), today.getMonth(), today.getDate() + 30)
  return dueDate
}
Splitting functions makes the code quicker and easier to understand and reduces coupling, which makes it easier to “reassemble” the pieces into new functions. You may find this cumbersome and unnecessary, but the goal of refactoring is to make your code readable. If one day you need to modify this function or add functionality to it, you’ll thank yourself for the refactoring.
4.4. Data clumps && long parameter list
A data clump is, as its name suggests, a lump of data items stuck together in no particular shape, which makes them difficult to manage.
If several parameters belong together, or if certain pieces of data always appear as a group, we should shape the clay into something concrete and encapsulate it as a data object.
// Before refactoring
function addUser(name, gender, age) {
  const person = { name, gender, age }
  // ... some other code ...
  // officeAreaCode is paired with officeNumber: if officeNumber is missing, officeAreaCode is meaningless
  person.officeAreaCode = '+86'
  person.officeNumber = '13688888888'
  return person
}
// After refactoring
class TelephoneNumber {
  constructor(officeAreaCode, officeNumber) {
    this.officeAreaCode = officeAreaCode
    this.officeNumber = officeNumber
  }
}
// Merge the parameters into a single object
function addUser(person) {
  // ... some other code ...
  // Encapsulate the paired data
  person.telephone = new TelephoneNumber('+86', '13688888888')
  return person
}
4.5. Global data
Most of the time we cannot avoid using global data, but even a single global variable places higher demands on how we manage it. Even a small change can cause problems in many places, and what is even more frightening is that such a change can be triggered inadvertently.
Because every function has access to these variables, it becomes increasingly difficult to figure out which function actually reads and writes them. To understand how a program works, almost every function that modifies the global state must be considered, which makes debugging difficult.
If you rely on state passed between functions instead of on global variables, it is easier to understand what each function does, because you don’t have to keep the global variables in mind.
let globalData = 1
// bad
function foo() {
  globalData = 2
}
// bad
function fuu() {
  globalData = {
    a: 1
  }
}
Now, we’ll do some encapsulation of global data to control access to it.
// good: use a constant
const constantData = 1
// good: encapsulate the variable
let globalData = 1
function getGlobalData() {
  return globalData
}
function setGlobalData(newGlobalData) {
  // isValid is a validation function assumed to be defined elsewhere
  if (!isValid(newGlobalData)) {
    throw new Error('Illegal input!!!')
  }
  globalData = newGlobalData
}
// Expose the accessor methods
export {
  getGlobalData,
  setGlobalData
}
Now the global variable cannot be “touched” casually: it is easy to find where it is changed, and invalid changes are rejected.
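The earlier point about relying on state passed between functions, rather than on globals, can be sketched as follows. The counter state and the function names are hypothetical and only serve as an illustration:

```javascript
// Instead of reading and writing a shared variable,
// each function receives the state it needs and returns the new state.
function increment(state) {
  return { ...state, count: state.count + 1 }
}

function summarize(state) {
  return `count is ${state.count}`
}

const initialState = { count: 1 }
const nextState = increment(initialState)

console.log(summarize(initialState)) // count is 1 (the original state is untouched)
console.log(summarize(nextState))    // count is 2
```

Because every input and output is explicit, each function can be understood and tested on its own.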
4.6. Divergent change
Divergent change occurs when a function has to be changed in different ways for different reasons. This sounds a little confusing, so let’s explain it with code.
function getPrice(order) {
  // Get the base price
  const basePrice = order.quantity * order.itemPrice
  // Get the quantity discount
  const quantityDiscount = Math.max(0, order.quantity - 500) * order.itemPrice * 0.05
  // Get the shipping cost
  const shipping = Math.min(basePrice * 0.1, 100)
  // Calculate the price
  return basePrice - quantityDiscount + shipping
}
const orderPrice = getPrice(order)
This function calculates the price of an order: the base price minus the quantity discount plus the shipping cost. If the rules for the base price change, we need to modify this function; if the discount rules change, we need to modify this function; if the shipping rules change, we still have to modify this function.
Having so many reasons to change one function is confusing. We would like to be able to go to a single point in the system whenever a particular rule needs to change, so it’s time to pull the calculations apart.
// Calculate the base price
function calBasePrice(order) {
  return order.quantity * order.itemPrice
}
// Calculate the discount
function calDiscount(order) {
  return Math.max(0, order.quantity - 500) * order.itemPrice * 0.05
}
// Calculate the shipping cost
function calShipping(basePrice) {
  return Math.min(basePrice * 0.1, 100)
}
// Calculate the order price
function getPrice(order) {
  return calBasePrice(order) - calDiscount(order) + calShipping(calBasePrice(order))
}
const orderPrice = getPrice(order)
Although this function is not long, refactoring it follows the same process as refactoring the long function earlier: separating each calculation into its own function makes problems easier to locate and changes easier to make. A long function therefore usually carries several smells at once and needs to be dealt with promptly!
4.7. Shotgun surgery
Shotgun surgery sounds similar to divergent change, but the two are opposites, like yin and yang. Shotgun surgery is a bit like duplicated code: to make one small change, we have to hunt through many places and modify them one by one. Not only are they hard to find, it is also easy to miss an important one until something goes wrong!
// File Reading.js
const reading = { customer: "ivan", quantity: 10, month: 5, year: 2017 }
function acquireReading() { return reading }
function baseRate(month, year) {
  /* ... */
}
// File 1
const aReading = acquireReading()
const baseCharge = baseRate(aReading.month, aReading.year) * aReading.quantity
// File 2
const aReading = acquireReading()
const base = (baseRate(aReading.month, aReading.year) * aReading.quantity)
const taxableCharge = Math.max(0, base - taxThreshold(aReading.year))
function taxThreshold(year) { /* ... */ }
// File 3
const aReading = acquireReading()
const basicChargeAmount = calculateBaseCharge(aReading)
function calculateBaseCharge(aReading) {
  return baseRate(aReading.month, aReading.year) * aReading.quantity
}
In the above code, if the reading logic changes, we need to adjust it across several files, which can easily cause omissions.
Since reading is being operated on everywhere, we can encapsulate it and manage it in a single file.
// File Reading.js
class Reading {
  constructor(data) {
    this.customer = data.customer
    this.quantity = data.quantity
    this.month = data.month
    this.year = data.year
  }
  get baseRate() {
    /* ... */
  }
  get baseCharge() {
    // baseRate is a getter on this object, so it is read as a property
    return this.baseRate * this.quantity
  }
  get taxableCharge() {
    return Math.max(0, this.baseCharge - this.taxThreshold)
  }
  get taxThreshold() {
    /* ... */
  }
}
const reading = new Reading({ customer: 'Evan You', quantity: 10, month: 8, year: 2021 })
Putting all the related logic together not only gives it a shared context but also simplifies the calling code and makes it clearer.
4.8. For loop statements
It may be surprising that loops, which have always been a core element of programs, count as a smell in the refactoring world. The point is not to abolish loops, but plain for loops are a bit dated these days and we have good alternatives. In JavaScript, pipeline operations (filter, map, etc.) help us process elements and make it easier to see what each processing step does.
Next we will pick out all the programmers in a group of people and record their names. Which version is more pleasing?
// for
const programmerNames = []
for (const item of people) {
  if (item.job === 'programmer') {
    programmerNames.push(item.name)
  }
}
// pipeline
const programmerNames = people
  .filter(item => item.job === 'programmer')
  .map(item => item.name)
Of course, at this point you might want to point out the difference in performance, but don’t forget that the point of refactoring is to make the code cleaner; performance is not the priority here.
Unfortunately, only a few pipeline operations can traverse an array in reverse (reduce has the counterpart reduceRight); for most of the others you have to reverse the array first. So whether to give up the for loop is up to you and depends on the actual scenario.
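To make the trade-off concrete, here is a small sketch of iterating in reverse. `reduceRight` walks the array from the end on its own, while `map` and `filter` need a reversed copy first; the sample data is made up:

```javascript
const steps = ['build', 'test', 'deploy']

// reduceRight visits elements from last to first.
const viaReduceRight = steps.reduceRight((acc, step) => acc.concat(step), [])

// map/filter have no reverse variant: copy the array, then reverse the copy.
// (Array.prototype.reverse mutates in place, hence the spread copy.)
const viaReverse = [...steps].reverse().map(step => step.toUpperCase())

// A plain for loop can simply count down.
const viaFor = []
for (let i = steps.length - 1; i >= 0; i--) {
  viaFor.push(steps[i])
}

console.log(viaReduceRight) // [ 'deploy', 'test', 'build' ]
console.log(viaReverse)     // [ 'DEPLOY', 'TEST', 'BUILD' ]
console.log(viaFor)         // [ 'deploy', 'test', 'build' ]
```

When the traversal order matters and an extra copy is undesirable, the counting-down for loop remains a reasonable choice.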
4.9. Complex conditional logic && consolidating conditional expressions
Complex conditional logic is one of the most common sources of complexity. The code tells us what happens, but we often cannot tell why it happens, which shows that the code has become much less readable. It’s time to encapsulate the conditions in functions with descriptive names.
// bad
if (!date.isBefore(plan.summerStart) && !date.isAfter(plan.summerEnd)) {
  charge = quantity * plan.summerRate
} else {
  charge = quantity * plan.regularRate + plan.regularServiceCharge
}
// good
if (isSummer()) {
  charge = quantity * plan.summerRate
} else {
  charge = quantity * plan.regularRate + plan.regularServiceCharge
}
// perfect
charge = isSummer() ? summerCharge() : regularCharge()
If a series of conditions is checked and the conditions differ but the resulting behavior is the same, we should combine them into a single conditional expression using logical OR and logical AND. Then encapsulate the combined condition as in the code above!
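if (man.age < 18) return 0
if (man.hasHeartDisease) return 0
if (!isFull) return 0
// step 1: every check returns the same value, so combine them with logical OR
if (man.age < 18 || man.hasHeartDisease || !isFull) return 0
// step 2: extract the checks on man into a well-named function
if (isIllegalEntry(man) || !isFull) return 0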
4.10. Query function coupled with modifier function
A function that just returns a value and has no side effects is a valuable thing. I can call it whenever I like without worry and move it wherever I like. All in all, there is a lot less to think about.
It’s a good idea to explicitly separate functions with side effects from those without, so it’s time to separate the query and modifier functions that so often get bundled together in everyday development!
// Send an email to five-star employees with two years of service or less
function getTotalAndSendEmail() {
  const emailList = programmerList
    .filter(item => item.occupationalAge <= 2 && item.stars === 5)
    .map(item => item.email)
  return sendEmail(emailList)
}
// Separate the query function; the query could be controlled further by passing parameters
function search() {
  return programmerList
    .filter(item => item.occupationalAge <= 2 && item.stars === 5)
    .map(item => item.email)
}
function send() {
  return sendEmail(search())
}
With better control over query behavior and function reuse, we have less to worry about in a function.
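The comment in the code above mentions controlling the query through parameters. One possible shape for that is sketched below; `searchEmails`, `sendTo`, the option names, and the injected `sendEmail` callback are assumptions made for illustration and are not part of the original code:

```javascript
// Query: no side effects, safe to call anywhere.
function searchEmails(programmerList, { maxOccupationalAge, stars }) {
  return programmerList
    .filter(item => item.occupationalAge <= maxOccupationalAge && item.stars === stars)
    .map(item => item.email)
}

// Modifier: the only place with a side effect. The sender is passed in
// so the example does not depend on a real mail service.
function sendTo(emailList, sendEmail) {
  return sendEmail(emailList)
}

const programmerList = [
  { email: 'a@example.com', occupationalAge: 1, stars: 5 },
  { email: 'b@example.com', occupationalAge: 3, stars: 5 },
  { email: 'c@example.com', occupationalAge: 2, stars: 4 },
]

const recipients = searchEmails(programmerList, { maxOccupationalAge: 2, stars: 5 })
sendTo(recipients, list => console.log('sending to', list))
```

The query can now be reused with other criteria, and tested without sending anything.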
4.11. Replace nested conditional expressions with guard clauses
Let’s go straight to the code.
function getPayAmount() {
  let result
  if (isDead) {
    // do sth and assign to result
  } else {
    if (isSeparated) {
      // do sth and assign to result
    } else {
      if (isRetired) {
        // do sth and assign to result
      } else {
        // do sth and assign to result
      }
    }
  }
  return result
}
Reading this function, you are glad that there are only comments between the if and else branches rather than real code; with real code in there it would be dizzying. What about the code below?
function getPayAmount() {
  if (isDead) return deadAmount()
  if (isSeparated) return separatedAmount()
  if (isRetired) return retiredAmount()
  return normalPayAmount()
}
The essence of a guard clause is to single out one branch for special attention. It tells the reader that this case is not the core logic of the function and that, if it does occur, the function does whatever is necessary and exits early.
I’m sure every programmer has heard the rule that a function should have only one entry and one exit, but the “single exit” principle doesn’t seem to help here. In the refactoring world, keeping code clear is what matters. If “single exit” makes your code easier to read, use it; otherwise don’t.
5. When to start refactoring
5.1. Before adding new features
The best time to refactor is before adding new features.
If you take a look at your existing code base before you start adding a feature, you’ll often find that a small adjustment to the code structure makes the job much easier. Say you have a function that provides most of what you need but contains several literal values that differ from what you need. Without refactoring, you have to copy the entire function and tweak it, which produces duplicate code, the beginning of a code smell. So put on your refactoring hat first; once you’re done, developing the feature becomes easy.
This is the ideal, but in reality tasks are scheduled against deadlines, and the extra time spent refactoring up front can make you lose control of your schedule and slip, so it often doesn’t fit the workplace.
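The literal-value situation described above might look like the following sketch. All names are hypothetical: a function with a hard-coded percentage is given a parameter instead of being copied and tweaked for the new feature.

```javascript
// Before: the 10% raise is baked in, so a 15% variant would need a copy.
function tenPercentRaiseBefore(salary) {
  return salary + salary * 10 / 100
}

// After: the literal becomes a parameter, and both variants share one function.
function raise(salary, percent) {
  return salary + salary * percent / 100
}
const tenPercentRaise = salary => raise(salary, 10)
const fifteenPercentRaise = salary => raise(salary, 15)

console.log(tenPercentRaise(1000))     // 1100
console.log(fifteenPercentRaise(1000)) // 1150
```

The existing behavior is preserved by `tenPercentRaise`, and the new feature becomes a one-line addition.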
5.2. After completing a new feature or after code review
Taking task scheduling and real-world work into account, the best time to refactor is after completing a feature and after a code review.
Once a feature has been completed and tested, the progress of the task is under control, and refactoring does not affect the functionality the code already implements. With a version control system such as Git, it’s easy to fall back to the code as it was when the feature worked, whereas nothing lets you instantly finish a feature that was never properly implemented.
Refactoring after each feature is also similar to the idea of time slicing in garbage collection. You don’t wait until the code is full of “garbage” before cleaning up, which would cause a “stop-the-world” pause. Break the refactoring into small steps.
Having the team review your code, especially teammates who work on the same project, often uncovers problems that are hard to notice on your own. For example, a function you wrote may already have been implemented by a colleague and can be extracted and reused, or a more experienced colleague may suggest a more elegant implementation.
Code you write yourself also tends to carry your own style and “bad habits”. A personal code style is not a mistake, but in a team a mix of different styles makes reading and cooperation harder. As for “bad habits” such as extremely complex conditional statements, it is hard to notice them yourself; it takes other people’s opinions to correct them.
In fact, I think refactoring after each new feature has two important advantages: you gain a clearer understanding of your code, and you do something that would otherwise never get done.
The first is easy to see: understanding the code more clearly lets us locate problems better and improve the code.
So what is the thing that would otherwise never get done? That’s right, refactoring. If you don’t review a new feature right after finishing it, the code will most likely sit untouched until it produces a bug. Over time, the whole project becomes hard to maintain and the code starts to smell.
Moreover, right after a feature is finished the refactoring workload is usually small. It is an easy piece of work that can be done in one go while the momentum lasts; if you plan to look at it later, there is often no later.
5.3. When it is difficult to add new features
This is a situation we hope never to reach: it means the code structure is already in disarray and there are several hurdles to clear before new functionality can be added. At this point refactoring is both unavoidable and a big undertaking, which can bring the project to a “complete halt”. Even worse, refactoring may turn out to cost more than rewriting. This is what we need to avoid.
6. When not to refactor
6.1. Rewriting is easier than refactoring
It goes without saying.
6.2. When there is no need to understand the code snippet
If a feature or API has been “working hard” for a long time without ever producing a bug, we can tolerate its ugliness, even if it hides very ugly code. Don’t forget that one of the main reasons to refactor is to make code easier to understand; when we don’t need to understand it, let it lie quietly. One of the principles of refactoring is not to let uncontrollable behavior occur.
6.3. Without consulting your collaborators
If a feature is referenced by several modules that you are not responsible for, you must notify the people responsible in advance that it will change, even if the refactoring does not change how it is used. Otherwise the refactoring is “out of control”.
7. Refactoring and performance
The performance impact of refactoring is one of the most frequently raised concerns. After all, refactoring often increases the number of lines of code that are executed (not necessarily the total number of lines, since refactoring extracts functions for reuse, and a good refactoring tends to reduce the total), or turns some faster code into more readable code at the expense of performance.
First of all, it’s important to recall that code refactoring and performance optimization are two different concepts. Refactoring is only about comprehensibility and extensibility, not efficiency. It’s important not to wear both hats when refactoring.
The impact of refactoring on performance is probably not as high as you might think, and in most business situations, the difference in performance between pre-refactoring and post-refactoring code is almost impossible to see.
In most cases we don’t need to “squeeze” the computer to the limit to save a few clock cycles; it matters more to reduce the time we spend on development.
If you are not satisfied with the performance after refactoring, you can optimize some of the time-consuming functions after refactoring. The interesting thing is that most programs spend more than half their time running on a small portion of code, and optimizing that portion of code can yield significant performance gains. If you optimize all the code equally, you will find that you are wasting your time, because the optimized code will not be executed very often.
So I think you don’t have to worry too much about performance when you refactor, and you can refactor and optimize for individual snippets if necessary. In the short term, refactoring may slow down the software, but it also makes performance tuning easier, and it pays off in the end.
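Finding the small portion of code where the time actually goes requires measuring rather than guessing. Below is a minimal timing helper as a sketch; `timeIt`, `sumTo`, and the workloads are hypothetical, and a real profiler (for example the one built into browser developer tools) gives far more detail:

```javascript
// Runs fn once and reports how long it took, without changing its result.
function timeIt(label, fn) {
  const start = Date.now()
  const result = fn()
  const elapsedMs = Date.now() - start
  console.log(`${label}: ${elapsedMs} ms`)
  return { result, elapsedMs }
}

// Sample workload: sum the integers from 1 to n.
function sumTo(n) {
  let total = 0
  for (let i = 1; i <= n; i++) total += i
  return total
}

const small = timeIt('sumTo(1e3)', () => sumTo(1e3))
const large = timeIt('sumTo(1e6)', () => sumTo(1e6))
```

Only functions that show up as expensive in such measurements are worth optimizing after a refactoring; the rest can stay in their more readable form.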
8. Wrapping up
The author is not a “refactoring master”, and this article only shows some very common refactoring techniques and brief thoughts on refactoring. Many classic techniques and cases are not covered here. If you are interested in refactoring and want to learn more, read Martin Fowler’s classic book Refactoring: Improving the Design of Existing Code, 2nd edition, which uses JavaScript as its sample language, a boon for front-end engineers.
For VSCode users, there are a number of excellent plug-ins to help you refactor, such as JavaScript Booster or Stepsize, which can show you how to refactor and bookmark and report your code.
Now that you’ve read this, you know what to do. Commit a feature, review and refactor.
9. References
[0] Martin Fowler, Refactoring: Improving the Design of Existing Code, 2nd edition
[1] 24 common bad smells in code and their refactorings
[2] Six useful front-end refactoring plug-ins for VSCode