As a programmer, you often need to read the source code of open source projects. Doing so pays off in several ways:

1. Improve yourself

Reading excellent code raises our own coding level, broadens the way we think about writing code, and may even help us land an offer from a big tech company. Either way, good code is what takes us to the next level as developers, although reading and understanding it is not easy.

2. Fix bugs

Sometimes an open source component we use misbehaves in unexpected ways. With no prior experience and no documentation to fall back on, we have to fix it ourselves: read the code, understand the project, and fix the problem. If we cannot read the code well, fixing bugs becomes a thorny issue that holds up our work.

3. Add new features

At work, we sometimes search through open source libraries and find no component that quite fits. That leaves modifying an existing component, and the prerequisite for modifying it is understanding it. Again, the only way is to read the code.

There are many benefits to reading source code, but reading it is not a simple task.

On the contrary, it is very hard. Typical difficulties include:

  • The code is boring to read, and after a while you get sleepy and confused.
  • You read for a long time, only to find that nothing has stuck. The time is spent, nothing is learned, and you feel discouraged.
  • You spend days on an open source component and end up understanding only one or two files, which delays your regular work.

I've been working with open source components for a long time and have been forced to read a lot of code. After running into all of the difficulties above, it took me several years to distill my own set of techniques. Only then could I quickly read through and understand many open source components.

A number of readers have asked me how to read code, so I've decided to write down the tricks I've collected, in the hope that they help you improve faster.

So let’s look at some of the ways I personally read code.

One, get an overview of the big picture

Before reading the code, we need a bird's-eye view of the source so that we get the full picture of the component.

The full picture includes:

1. What the project is for

We need to know what the project is mainly for, because that is its ultimate goal.

All of the source code in an open source project is written with this goal in mind.

Take Logback as an example. Its purpose is logging. However complex its code is, the ultimate goal of all of it is to make Logback print logs robustly and efficiently.

2. Project architecture

The value of understanding the architecture is that, once you know how the system is layered, you can work out the project's main flow. With the main flow in hand, we can spend our limited time on the most valuable code.

If the official documentation has an architecture diagram, use it to understand the overall architecture. If it does not, check whether someone in the community has drawn one. If not, draw your own based on the descriptions in the official documentation.

Take Logback as an example. Since no official architecture diagram was provided, I drew a rough one based on the documentation.

Two, run it and play with it

Having figured out the main flow of the system, we need to get the project up and running.

Running projects serves two purposes:

1. Learn what the project needs before it can run

Let's go back to the Logback example.

To run Logback the way a real project does, we normally supply a logback.xml file. Without one, Logback falls back to a minimal default console configuration, which tells us very little.

The logback.xml file matters a great deal when reading the source code, because it names the key elements Logback needs.

Also, when the source code gets confusing, understanding this configuration file can help us over the hurdle. More on how to read the source code in detail later.

A basic Logback configuration lists the key components needed to run Logback: an appender, an encoder with its output pattern, and a root logger that references the appender.

2. Resolve confusion through debugging

The open source projects we read tend to be complex. There are three typical cases:

  • It is unclear what a method's variables mean
  • The logic jumps back and forth
  • Objects are wrapped too many layers deep

In practice, these situations are best resolved by stepping through the running code in a debugger.
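When a debugger is not at hand, a similar effect can be had by printing the call path from inside the method that confuses you. The sketch below is only an illustration: `CallPath` and the `log`, `format`, and `write` methods are made-up names standing in for framework code, not anything from Logback.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Logback or any library.
class CallPath {

    // Returns the chain of methods that led to the caller of this method,
    // innermost call first.
    static List<String> current() {
        List<String> frames = new ArrayList<>();
        StackTraceElement[] stack = new Throwable().getStackTrace();
        // stack[0] is current() itself, so start at index 1.
        for (int i = 1; i < stack.length; i++) {
            frames.add(stack[i].getClassName() + "." + stack[i].getMethodName());
        }
        return frames;
    }

    // A made-up call chain standing in for jumpy framework code.
    static List<String> log() { return format(); }
    static List<String> format() { return write(); }
    static List<String> write() { return current(); }

    public static void main(String[] args) {
        for (String frame : log()) {
            System.out.println(frame);
        }
    }
}
```

Dropping a call like `CallPath.current()` into a method you do not understand shows who called it and in what order, which is often enough to untangle logic that jumps around.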

Three, tease the goal apart

You know the big picture and the main flow, and the project is up and running. So now you can start reading the code, right?

Not yet. You still need to refine your goals.

As I said earlier, we read source code for three purposes:

  1. Improve yourself
  2. Fix the bug
  3. Add new features

But these aims are too vague. To improve yourself, which code should you read? To fix a bug, which code should you read? To add a new feature, which code should you read?

So you have to pick out the code that is actually useful. How do you choose?

One thing we hear a lot in development is decomposition: breaking big problems into smaller ones.

The same goes for selecting and reading useful code.

When there is too much code and too much functionality, the most important thing is to break vague goals down into small, precise, achievable ones. Mapped onto the project, each of these small goals is really one business process of the project.

For example, suppose we want to add a feature to Logback that lets a company print all of its logs in one uniform, fixed format. Here is how we would go about it:

1. Vertical decomposition

Vertical decomposition means picking out one top-to-bottom business process on the architecture diagram.

Since we want to unify the company's log format, we clearly need to format the log content before it is written to a file. So the business process to pick is the one that starts when the application calls Logback to print a log and ends when the log content is written to the destination file.
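To make that vertical slice concrete, here is a toy model of the flow from the application call down to the destination. It is a sketch under heavy assumptions: the `Logger`, `ListAppender`, and `Encoder` types below are hypothetical and far simpler than Logback's real classes, the list stands in for the log file, and `acme` is a made-up company name.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a logging pipeline. All types here are hypothetical and much
// simpler than Logback's real Logger, Appender, and Encoder.
class LogPipelineSketch {

    // Step 3 of the flow: turn an event into the final text.
    interface Encoder {
        String encode(String level, String message);
    }

    // Step 2 of the flow: receive the event and send the encoded text somewhere.
    static class ListAppender {
        final List<String> lines = new ArrayList<>(); // stands in for the destination file
        private final Encoder encoder;

        ListAppender(Encoder encoder) {
            this.encoder = encoder;
        }

        void append(String level, String message) {
            lines.add(encoder.encode(level, message));
        }
    }

    // Step 1 of the flow: the entry point the application calls.
    static class Logger {
        private final ListAppender appender;

        Logger(ListAppender appender) {
            this.appender = appender;
        }

        void info(String message) {
            appender.append("INFO", message);
        }
    }

    public static void main(String[] args) {
        // The company-wide fixed format lives in exactly one place: the encoder.
        Encoder companyFormat = (level, message) -> "[acme] " + level + " | " + message;
        ListAppender appender = new ListAppender(companyFormat);
        new Logger(appender).info("order created");
        System.out.println(appender.lines.get(0));
    }
}
```

Seen this way, the goal "unify the log format" narrows to one step of one process, the encoding step, and that is the code worth reading first.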

2. Horizontal extension

Horizontal extension decides how we combine several business processes so that together they cover the overall goal we set.

For example, once you have seen how Logback prints a log, you can set up a second process to see how Logback switches (rolls over) log files.

Four, dive into the code

Okay, now we are finally going to look at the code.

But reading code takes technique. It is not just a matter of staring at it.

Map the process to the code

First, we refine the goal and draw out a complete business process. With that in place, we can map the business process onto the code logic.

Take Logback as an example:

Go deep

With the business process mapped onto the code, we can start reading. While reading, a few more techniques help:

Tip 1: Skip code on purpose

Not all code is worth reading closely. Our first priority is the core code on the forward (happy-path) flow. The rest can be skipped.

Examples of code to skip are:

  • Input validation code: it adds little to our understanding of the system. When we want to improve our own coding skills, we can come back later and study a well-chosen set of excellent examples.

  • Error handling and exception state handling code: same reason as above.

  • Data processing code: it usually parses input data or wraps output data, and sometimes passes data around in DTOs or DAOs. Some of it is complex and long. Reading it drains energy, causes confusion, and rarely helps us grasp how the project works, so be sure to skip it.

  • Low-level interaction code: to be honest, low-level interaction is highly technical and requires a lot of low-level knowledge that cannot be made up in a short time. Failing to understand it is a real blow to confidence, so I suggest skipping it.
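As a small illustration of skimming, here is how one might mentally mark up a method on a first read. The method is hypothetical and written only for this example; it is not taken from any real project.

```java
import java.util.Locale;

// Hypothetical method, annotated the way one might mark up code on a first read.
class SkimExample {

    static String formatLine(String level, String message) {
        // SKIP on first read: input validation.
        if (level == null || message == null) {
            throw new IllegalArgumentException("level and message are required");
        }

        // SKIP on first read: data clean-up.
        String normalizedLevel = level.trim().toUpperCase(Locale.ROOT);
        String singleLine = message.replace('\n', ' ');

        // READ: the forward flow, the one line that explains what the method is for.
        return normalizedLevel + " " + singleLine;
    }

    public static void main(String[] args) {
        System.out.println(formatLine(" info ", "started\nok"));
    }
}
```

Most of the method is validation and clean-up. Only the last statement carries the forward flow, and that is the part to read first.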

Tip 2: Work out the call relationships

Code does not always execute in the order in which it appears on the page.

If you are reading the code and cannot find where the flow continues, consider whether the author is calling the next method or object in a non-sequential way.

In general, developers make non-sequential calls in the following ways:

  • Continuing the flow through middleware, such as a message queue (MQ)
  • Continuing the flow asynchronously, such as with the Future or Promise pattern
  • Continuing the flow through callbacks
  • Continuing the flow through proxies and delegation, such as dynamic proxies
  • Continuing the flow through dependency injection, such as Spring's @Autowired annotation

These non-sequential calls can seriously get in the way of reading the code. There are roughly two ways to deal with them:

  • Guess directly: when we mapped the business process onto the actual code objects, we already got a rough idea of what comes next. If the next step is an interface and it has only a few implementation classes, skim them, and you can usually guess which one is used.
  • Run and debug: this method is very common, and it works for anything you are unsure about.
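One way to apply the run-and-debug approach when an interface hides the real target is to wrap the object in a JDK dynamic proxy that records every call before passing it on. This is a minimal sketch: `CallTracer`, `Appender`, and `ConsoleAppender` are hypothetical names used for illustration, while `java.lang.reflect.Proxy` and `InvocationHandler` are the standard JDK classes.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

class CallTracer {

    // Hypothetical interface standing in for one you found while reading.
    interface Appender {
        void append(String message);
    }

    // Hypothetical implementation; in a real project there may be several.
    static class ConsoleAppender implements Appender {
        public void append(String message) {
            System.out.println(message);
        }
    }

    // Wraps target in a JDK dynamic proxy that records "ConcreteClass.method"
    // for every call and then delegates to the real object.
    @SuppressWarnings("unchecked")
    static <T> T trace(T target, Class<T> type, List<String> calls) {
        InvocationHandler handler = (proxy, method, args) -> {
            calls.add(target.getClass().getSimpleName() + "." + method.getName());
            return method.invoke(target, args);
        };
        return (T) Proxy.newProxyInstance(type.getClassLoader(), new Class<?>[] {type}, handler);
    }

    public static void main(String[] args) {
        List<String> calls = new ArrayList<>();
        Appender appender = trace(new ConsoleAppender(), Appender.class, calls);
        appender.append("hello");
        System.out.println(calls);
    }
}
```

The recorded list answers the question that static reading cannot: which concrete class actually received the call. A breakpoint inside the interface method gives the same answer, and the proxy is just a variant that leaves a written trace.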

Tip 3: Leave the really hard algorithms for last

Some open source projects use a lot of classic algorithms. They are classics, but they are hard.

These algorithms can seriously slow down our understanding of the project as a whole. I suggest noting where they are and moving on. Come back later and work through them at your own pace.

Logback's log file splitting algorithm is one example. When you are trying to understand the business process, it is not worth understanding the algorithm right away. You can set a separate goal to study it later.

That’s the code reading routine I’ve been following for years.

Summary

First, before reading the code, take a look at the big picture of the project. Official documentation, personal blogs, and the like can speed this up. It is a good idea to refer to architecture diagrams and sequence diagrams, or to draw your own if none exist.

Then get the project up and running, which lets us debug and quickly understand the difficult parts of the code.

Next, refine our goals into business processes. Without them, we read huge chunks of code at once, with no clear thread and no reachable objective, and the result is chaos.

Finally, start actually reading the code. Read with technique: know how to skip some code, know how to find where the call flow continues, and know how to set the hard parts aside and tackle them with focus later.

This is how I read code quickly and deeply.

Another benefit of reading more code is that when I am designing a project architecture or writing a framework, similar projects and code blocks come to mind on their own.

In short, I have benefited a lot from reading code. It is one of the important reasons my career has gone well, and I hope it helps others too.

If you found this article helpful, please give it a thumbs-up. Even a small one supports my original writing.