How to read source code efficiently

“Why read the source code when I can proficiently use the framework/software/technology?”

“I usually don’t need to look at the source code. Reading it is time-consuming and easy to forget. It is more efficient to read the relevant parts when a problem comes up at work.”

“You only need to read source code for interviews!”

If you have similar questions, keep reading.

1. Why read source code?

1.1 Strengthen your command of fundamental, general-purpose technology

In the Java world, the collections framework and the concurrency utilities (JUC) are used constantly in projects. In complex scenarios, choosing the right data structure and thread concurrency model and keeping lock granularity reasonable can significantly improve an application’s availability and robustness, and it is an easy way to demonstrate your technical strength. Engineers who do this well are more likely to be recognized by their leaders, which helps their careers.

Of course, reading the source code is not the only way to learn how something works, but for a programmer, facing the code directly and experiencing its charm first-hand is often the most direct route.

1.2 Create your own highlights in key areas

My company uses Dubbo and RocketMQ, and I have been fortunate to take part in applying, operating, and maintaining these technology stacks, accumulating a lot of hands-on experience. To build on that advantage, I read their source code in detail and published a large number of technical articles on knowledge-sharing platforms such as CSDN and my official account. Systematically analyzing their implementation principles and architectural design, and combining theory with practice, made me a technical expert in Dubbo and RocketMQ and a core member of the team.

Because the articles were systematic, a publishing house invited me to write a book, and RocketMQ Technology Inside came into being. It became a very prominent calling card in my list of professional skills, gave me recognized technical influence, and carries a certain “brand premium”.

Of course, you can wait until a problem appears before looking at the source code, and the “input-output ratio” of doing so is higher. But this is a passive process. If concurrency in your production environment is low, you may go a year or two without meeting a real problem, so experience accumulates very slowly, and someone who has worked for 4 or 5 years may end up no stronger than someone who has worked for 2 or 3.

So if you want to create highlights quickly, you still need to take the initiative to read the source code and systematically master its design concepts and implementation principles.

1.3 Learn design and coding from good source code

Learning to program is really a process of imitation. Excellent source code is the work of masters and is very nourishing: you can see how they abstract interfaces, how they apply the SOLID principles, and many useful programming techniques.

For example, JUnit is built out of design patterns. In its source you can see many patterns in action, which is far more instructive than reading pattern theory or toy examples.

2. How to read source code

Based on years of reading experience, I have developed a set of methods:

Understand the usage scenarios of the software and the responsibilities of the architectural design.
Look for official documentation to get an overview of the design philosophy of the software.
Set up your own development and debugging environment and run the official demos, to lay a foundation for further study.
Follow the main flow first and the branches afterwards. Split the code into parts and work through them one at a time.

Let me share some of my experience reading the RocketMQ source code to make the theory above as concrete as possible.

2.1 Understand the application scenarios of RocketMQ

The usage scenarios for MQ are fairly clear. Its two primary responsibilities are decoupling and peak shaving.

Take the simplest scenario: when a new user registers, the system awards points and coupons. The original architecture is usually as follows:

As you can see, user registration is tightly coupled with sending coupons and awarding points. As the business develops, suppose the activity department asks that users who register during the Spring Festival receive points and a New Year gift bag instead of coupons. With the architecture above, the main user registration flow has to be changed, which violates the open-closed principle: closed for modification, open for extension.

MQ solves this problem well:

With the introduction of MQ, the user registration main process only needs to complete the registration logic and send a message to MQ, and the activity module (send points, send coupons, send gift packages) only needs to subscribe to the message in MQ and process it accordingly.

This keeps the main user registration flow very simple, and it does not change no matter what kind of activity is added, which achieves decoupling.
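The decoupling described above can be sketched in a few lines of Java. This is a minimal in-memory illustration, not RocketMQ code: the `SimpleBus` class, the `USER_REGISTERED` topic name, and the three handlers are hypothetical stand-ins for a real producer, topic, and consumers, and delivery here is synchronous, whereas a real MQ delivers asynchronously.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

class DecouplingDemo {

    // Hypothetical in-memory stand-in for a message queue (not the RocketMQ API).
    static class SimpleBus {
        private final Map<String, List<Consumer<String>>> subscribers = new HashMap<>();

        // Register a handler for a topic.
        void subscribe(String topic, Consumer<String> handler) {
            subscribers.computeIfAbsent(topic, t -> new ArrayList<>()).add(handler);
        }

        // Deliver the message to every handler subscribed to the topic.
        void publish(String topic, String message) {
            List<Consumer<String>> handlers =
                    subscribers.getOrDefault(topic, Collections.emptyList());
            for (Consumer<String> handler : handlers) {
                handler.accept(message);
            }
        }
    }

    // The registration flow knows only about the bus, never about the activities.
    static class RegistrationService {
        static final String TOPIC = "USER_REGISTERED"; // hypothetical topic name
        private final SimpleBus bus;

        RegistrationService(SimpleBus bus) {
            this.bus = bus;
        }

        void register(String userId) {
            // ... persist the new user here ...
            bus.publish(TOPIC, userId);
        }
    }

    public static void main(String[] args) {
        SimpleBus bus = new SimpleBus();
        List<String> log = new ArrayList<>();

        // Each activity subscribes on its own.
        bus.subscribe(RegistrationService.TOPIC, id -> log.add("points:" + id));
        bus.subscribe(RegistrationService.TOPIC, id -> log.add("coupon:" + id));
        // A new Spring Festival activity is added without touching RegistrationService.
        bus.subscribe(RegistrationService.TOPIC, id -> log.add("gift:" + id));

        new RegistrationService(bus).register("u1");
        System.out.println(log);
    }
}
```

Adding or removing an activity only changes the list of subscribers. `RegistrationService.register` stays the same, which is the point of the design.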

2.2 Read through the official document and grasp its design concept from the overall perspective

After understanding the usage scenarios, we can turn to the official documentation, mainly the design documents (architecture design) and the user manuals, to understand the design philosophy as a whole.

Reading through the official documentation not only gives us the overall shape of the MQ (for example NameServer route discovery, message sending, message storage, message consumption, and message filtering). It also sparks interest and curiosity about “high-end” advanced features such as ordered consumption, zero copy, and synchronous and asynchronous disk flushing. That curiosity drives us to read the source code and explore the implementation details, and it means we read with our own questions in mind.

2.3 Setting up a development and debugging environment

There are several ways to set up a RocketMQ debugging environment in IDEA. Here is a tutorial on installing RocketMQ and debugging it in IDEA.

2.4 First trunk, then branch

After setting up the local development environment, do not immediately step through the whole message-sending flow in the debugger, because the flow is too long. At a coarse granularity, the flow looks like this:

Reading all the source code for this flow in one go is almost impossible. It touches high-availability design, message storage, disk flushing, synchronization, and more, and going into detail at each point is an enormous amount of work. We do not have that much continuous time, so splitting the work properly is necessary.

After such a decomposition, we can focus on understanding the design principle of a certain part of it, and the continuous time required can be greatly reduced. One bite at a time, we can finally complete the understanding of the whole system.

This has the added benefit of letting you post several articles in batches, which increases your output and sense of accomplishment.

3. Reading source code is easy to give up on. What can you do?

Reading source code is dull, and someone struggling alone can easily give up, especially when they hit a hard problem. How do you stick with it and read to the end?

My answer is to persevere, but allow yourself brief pauses and then sprint again, repeatedly, until the “hill” is conquered.

Because once you give up, everything invested so far is wasted, while once you break through, your ability takes a qualitative leap.

I ran into difficulties while reading the Netty source. I had just finished Netty’s memory leak detection and was ready to study its memory allocation mechanism. That part is very abstract: it involves complex data structures, requires understanding how a binary tree is mapped onto an array, and involves a large number of calculations. I was stuck trying to work out how it operates. What should I do? Give up?

When you don’t get a breakthrough for a week or two, it’s easy to wonder: Is it a waste of time to keep putting in time with no return?

In fact, I did think about giving up, but then I asked myself: after giving up, what will I do? Play games? Watch TV?

Aren’t playing games and watching TV a waste of time too? Once I had thought this through, pressing on and breaking through became the only option.

Of course, it is necessary to relax for a day or two once in a while and readjust your state. Taking a short break can make us feel less anxious and allow us to regroup and start sprinting again.

When faced with a problem, what can we do to help us overcome it?

1. Turn to Baidu for help

When you cannot understand a piece of code, copy a small section of it into Baidu and search. An expert may already have written an interpretation of that code, which can point you in the right direction.

Through my search at that time, I found that Netty’s memory allocation algorithm is not original but an implementation of the jemalloc algorithm. By consulting the relevant technical documents, we can understand Netty’s memory allocation algorithm as a whole.

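2. Debug

Netty strives for extreme performance and uses a large number of bit operations. Bit operations are rarely used in everyday work, so stepping through the code in a debugger and watching the actual values makes them much easier to follow.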
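To give a feel for the kind of bit arithmetic involved, here is a small sketch of the standard layout for a complete binary tree stored in an array, with the root at index 1. It is a generic illustration with made-up method names, not code taken from Netty. Running it and changing `id` is a cheap way to build intuition before debugging the real allocator.

```java
// Heap-style layout of a complete binary tree in an array.
// Index 0 is unused; the root lives at index 1.
// Illustrative only: these helpers are hypothetical, not Netty's code.
class TreeArrayDemo {

    static int leftChild(int id) {
        return id << 1;          // 2 * id
    }

    static int rightChild(int id) {
        return (id << 1) | 1;    // 2 * id + 1
    }

    static int parent(int id) {
        return id >>> 1;         // id / 2
    }

    // Sibling of a non-root node: flipping the lowest bit swaps left and right.
    static int sibling(int id) {
        return id ^ 1;
    }

    // Depth of a node, with the root at depth 0: floor(log2(id)) for id >= 1.
    static int depth(int id) {
        return 31 - Integer.numberOfLeadingZeros(id);
    }

    // Smallest power of two that is >= n, valid for 1 <= n <= 2^30.
    static int roundUpToPowerOfTwo(int n) {
        return n <= 1 ? 1 : Integer.highestOneBit(n - 1) << 1;
    }

    public static void main(String[] args) {
        int id = 5; // binary 101
        System.out.println(leftChild(id));          // 10
        System.out.println(rightChild(id));         // 11
        System.out.println(parent(id));             // 2
        System.out.println(sibling(id));            // 4
        System.out.println(depth(id));              // 2
        System.out.println(roundUpToPowerOfTwo(9)); // 16
    }
}
```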

After the above efforts, it took me 10 days to unravel the mechanism of Netty memory allocation. After breaking it down, I wrote a series of articles describing Netty memory allocation:

Once over this hurdle, reading the rest of the source code became much more efficient.

By solving one problem after another, quantitative change eventually became qualitative change, and I gradually formed my own approach to reading source code. The later source analysis columns on RocketMQ, Dubbo, ElasticJob, Sentinel, Kafka, and others became much easier.

4. Three levels of source code reading

4.1 Elementary: Keep a running account

My early source code reading articles were basically running accounts: source code annotated line by line, attentive only to low-level implementation details, without forming a higher level of understanding or distilling and deeply grasping the design concepts.

4.2 Intermediate: Ask questions, think, and refine

With the continuous sharing of technical articles, I got to know a lot of great people. When I communicated with them, I found that they would not talk about details at the beginning, but about design concepts.

This requires us to think about the source code as we read it, and ask ourselves how we would go about it, how we would design it, and study the source code with questions. Through comparison and reflection, we can have a deeper understanding of the concept behind it.

4.3 Advanced: Thinking, questioning, and verifying

No matter which open source framework you read, there may be bugs or unreasonable implementations. If, while reading the source code, we think deeply about whether it is reasonable, verify our view, and then contact the maintainers to discuss it and help move the community forward together, it shows that our ability and thinking have improved greatly.

Thinking and questioning are what elevate source code reading. One example is a question I raised while reading Sentinel’s circuit breaking.

A “user information lookup” service is deployed on three machines.

One machine, 192.168.1.3, had a period of high load and long response times, and 30% of the requests sent to it failed.

But 30% was the circuit breaker error rate configured in Sentinel, so Sentinel decided that the entire “user information lookup” service was unavailable and tripped the circuit breaker.

This is obviously unreasonable because 192.168.1.4 and 192.168.1.5 are still alive!

Essentially this is because the circuit breaker error rate is defined at the service level: service -> circuit breaker error rate

After thinking about this issue, I concluded that the circuit breaker rule configuration needs to be more fine-grained, with the circuit breaker error rate defined at the resource group level: [service, machine] -> circuit breaker error rate

In this way, when a service provider on one machine is disabled, it does not affect other machines, ensuring true high availability.
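The [service, machine] idea can be sketched as follows. This is a deliberately simplified illustration of the proposal, not Sentinel’s API or implementation: the `Breaker` and `Stats` classes, the `userInfoLookup` service name, and the use of a plain counter instead of a sliding time window are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

class PerMachineBreakerDemo {

    // Request counters for one [service, machine] pair.
    static class Stats {
        int total;
        int failed;
    }

    // Hypothetical breaker that tracks error rates per [service, machine] pair.
    static class Breaker {
        private final int thresholdPercent;
        private final Map<String, Stats> statsByKey = new HashMap<>();

        Breaker(int thresholdPercent) {
            this.thresholdPercent = thresholdPercent;
        }

        private static String key(String service, String machine) {
            return service + "@" + machine;
        }

        // Record the outcome of one call to a service on a specific machine.
        void record(String service, String machine, boolean success) {
            Stats stats = statsByKey.computeIfAbsent(key(service, machine), k -> new Stats());
            stats.total++;
            if (!success) {
                stats.failed++;
            }
        }

        // Open only for the pair whose own error rate has reached the threshold.
        boolean isOpen(String service, String machine) {
            Stats stats = statsByKey.get(key(service, machine));
            return stats != null && stats.failed * 100 >= stats.total * thresholdPercent;
        }
    }

    public static void main(String[] args) {
        Breaker breaker = new Breaker(30);
        String service = "userInfoLookup"; // hypothetical service name

        for (int i = 0; i < 10; i++) {
            breaker.record(service, "192.168.1.3", i >= 3); // first 3 of 10 calls fail
            breaker.record(service, "192.168.1.4", true);
            breaker.record(service, "192.168.1.5", true);
        }

        System.out.println(breaker.isOpen(service, "192.168.1.3")); // true
        System.out.println(breaker.isOpen(service, "192.168.1.4")); // false
    }
}
```

Because the key includes the machine, a caller can skip 192.168.1.3 while still routing requests to the two healthy machines.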

I contacted the maintainers, and my idea was acknowledged by Sentinel’s author, which in turn helped move the community forward.