With open source booming, more and more companies use open source projects to reduce development costs and improve efficiency. As developers, if we make full use of good open source projects, we can improve our practical skills and professional knowledge and also learn from their excellent architectural ideas.

This article offers some ideas on how to learn from open source projects. I believe that after reading it, even beginners can learn to read open source projects, and need not be daunted by seemingly lofty projects or stop at a superficial taste.

1 The value of learning

To sum up, the value of learning open source projects mainly includes the following:

  • Improved professional knowledge

A lot of general professional knowledge applies at any company in the field, and the underlying knowledge in particular can be learned from open source projects: multithreading, network communication, operating system mechanisms, and so on. For example, by studying Redis RDB persistence, which "saves a snapshot of the database currently in memory to a file on disk", you learn that it is implemented by forking a child process. Going one step further, this leads to the parent-child process mechanism and copy-on-write.

These pieces of knowledge connect and grow like a tree, but without thorough understanding there are no connections and the tree cannot grow on its own. As we come to understand the key points of open source projects, our knowledge grows and snowballs.
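
To make the RDB example concrete, here is a minimal sketch of a fork-based snapshot. It is not Redis code: the `snapshot` and `demo` functions, the in-memory `store` dictionary, and the JSON file format are assumptions made for illustration, and it runs only on Unix-like systems where `os.fork` is available.

```python
import json
import os
import tempfile


def snapshot(store, path):
    """Fork a child that writes `store` to `path`; return the child's pid."""
    pid = os.fork()
    if pid == 0:
        # Child process: it sees the parent's memory as it was at fork time.
        # The kernel shares pages copy-on-write, so nothing is copied up front.
        try:
            with open(path, "w") as f:
                json.dump(store, f)
        finally:
            os._exit(0)  # exit immediately, skipping the parent's cleanup code
    return pid


def demo():
    store = {"counter": 1}
    path = os.path.join(tempfile.mkdtemp(), "dump.json")
    pid = snapshot(store, path)
    # The parent keeps accepting writes while the child is saving.
    store["counter"] = 2
    os.waitpid(pid, 0)  # wait until the child has finished writing
    with open(path) as f:
        saved = json.load(f)
    return saved, store
```

The file holds the state as of the fork, while the parent's later write stays visible only in the parent. Note that CPython's reference counting touches memory pages even on reads, so the memory savings are smaller than in a C program, although the snapshot semantics are the same.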

  • Improved problem solving skills

By studying how open source projects are implemented, you can quickly locate the crux of an online problem and solve it by changing the configuration or modifying the source code. And when no suitable open source project fully meets a business need, you can adapt an existing one. To be a good developer and avoid becoming a passive "API caller", an important goal of studying an open source project is to understand how its features are implemented and optimized. Learning is like working through the derivation of a formula: knowing how to call the API is like memorizing the formula, which is enough to pass the exam, but understanding the derivation helps both memory and comprehension. We need to know not only what but also why, so that we can handle situations where the formula does not directly apply.

  • Improvement of thinking

By studying the architecture of mature open source projects, we can summarize and understand some common architectural ideas in software design. For example, high availability is achieved mainly through data redundancy across a cluster, as in Kafka and HDFS clusters. Extensibility is achieved by separating the layer that changes from the layer that stays stable and abstracting the business logic, as in the extension interfaces that Spring reserves.
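
As a rough illustration of separating the changing layer from the stable layer, the sketch below keeps a small fixed core and lets callers plug in behavior through an interface. It is only loosely inspired by the idea behind Spring's extension interfaces and is not Spring's API; `PostProcessor`, `Container`, and `UpperCaseGreeting` are hypothetical names.

```python
from abc import ABC, abstractmethod


class PostProcessor(ABC):
    """Extension point: the part that is expected to change."""

    @abstractmethod
    def process(self, name, obj):
        """Return the object to store, possibly a replacement."""


class Container:
    """Stable core: creates objects and runs every registered hook on them."""

    def __init__(self):
        self._processors = []
        self._objects = {}

    def add_processor(self, processor):
        self._processors.append(processor)

    def register(self, name, factory):
        obj = factory()
        for processor in self._processors:
            obj = processor.process(name, obj)
        self._objects[name] = obj

    def get(self, name):
        return self._objects[name]


class UpperCaseGreeting(PostProcessor):
    """One possible extension: upper-case every string object."""

    def process(self, name, obj):
        return obj.upper() if isinstance(obj, str) else obj
```

New behavior is added by writing another `PostProcessor`, while `Container` itself never has to change.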

2 Common misconceptions

There are some common misconceptions about learning about open source projects that cause novices to give up or waste a lot of time with little to gain:

Learning open source projects is a job for architects and technical experts. It is too hard for a beginner like me, and even if I learned it, I would never need it.

Learning is a process, and nobody becomes an expert overnight. But once you take the first step, becoming one is always possible. With continuous review and summarizing, guided by a sound method, I believe you can gain something and improve your abilities, and that your logical thinking and the structure of your knowledge will improve greatly. Even if your current work does not involve a particular open source project, what you learned will help when related problems come up later.

Data structures and algorithms are what matter, so I only need to study those two aspects of the project.

Don't focus only on data structures and algorithms, which are not that important when studying open source projects. For example, Nginx manages its timers with a red-black tree, and knowing that much is enough unless we need to change that functionality. It is more important to understand the design of the system and how its features are implemented.

Dive straight into the source code.

Many newcomers fall for the phrase "Talk is cheap, show me the code", popular on community forums, and dive into the source code, only to get lost in a maze of functions.

In fact, an open source project should be learned from the top down, and the source code at the bottom should come last. Before that, learn the project's architecture and design. That knowledge works like a database index: guided by it, you can make targeted strikes at the source code, like a cruise missile hitting its target precisely, which is naturally more effective than carpet bombing.

3 The four levels of learning

According to the depth of learning and understanding, learning can be divided into four levels:

  • Basic learning: a general, basic understanding of the project, such as what it is, what it does, how to use it, and what problems it solves. In interviews, many people who are new to the workplace list many frameworks on their resumes but have often reached only this level, and when questioned further they falter.

  • Review learning: a systematic understanding of the project, covering its features, basic principles, advantages and disadvantages, usage scenarios, configuration items, and API usage. In practice, this level is enough for an ordinary team member to meet basic business development needs, but not enough for someone pursuing greater technical depth.

  • Analytical learning: building on review learning, a comprehensive understanding of, and hands-on experience with, the project's performance parameters and performance tuning for your own scenarios. At this level you can work independently on production projects and take on a core developer role.

  • Subject learning: building on analytical learning, an understanding of the source code of the project's key modules, plus the ability to wrap or modify the source to meet actual needs, or to draw on the project to build a new wheel. People at this level are often capable of serving as a technical lead.

4 The four steps of learning

Given the levels described above, here is how to learn "from the top down" to reach each of them.

4.1 Basic understanding

The goal is to reach the basic learning level: a general understanding of the project, including its background, the problems it solves, its features, its application scenarios, and basic API usage. Official documentation, blogs, and videos are good sources.

Once you have a general understanding of the system, questions will arise naturally, for example about implementation principles or strengths and weaknesses. Carrying these questions into later study makes it more efficient.

4.2 Systematic learning and practice

The goal is to reach the review learning level: a systematic and comprehensive understanding of the project, including its features, component modules, basic principles, usage scenarios, configuration items, API usage, and how its strengths and weaknesses compare with similar projects.

Methods and steps are as follows:

  • 1 Install and run. Install and run the project following its documentation. During this process, pay attention to:

    • Dependencies. For example, Memcached's most important dependency is the high-performance network library libevent, from which we can infer that Memcached's network layer is built on the Reactor model.
    • Installation directories. Common ones are conf for configuration files, logs for log files, and bin for executables. Some projects also have special directories, such as Nginx's html directory. Noticing these raises questions that motivate further study, and learning with questions in mind is the most efficient way to learn.
    • Tools provided by the system. Pay special attention to the command line and the configuration files, which give two key pieces of information: what the system is capable of and how it will behave. This information is a window through which we can observe the system's internal mechanisms and principles. In general, once you understand the function and principle of every command line parameter and configuration item, you know the system well. In practice, keep trying to change configuration items and see how the system responds.
  • 2 Study principles and features systematically. This is very important, because only by clearly grasping a technology's principles and features can we truly master it and make reasonable choices when designing an architecture. In this process, focus on the following:

    • Key features. These are the project's main selling points, such as high performance, high availability, and scalability. How the project achieves them is what we need to focus on.

    • Advantages and disadvantages. These are analyzed mainly by comparison: take two similar systems, look at how their implementations differ, and work out the strengths and weaknesses of each approach. Typical comparisons are Memcached versus Redis, and Kafka versus ActiveMQ and RocketMQ.

    • Application scenarios. Which scenarios the project suits and which it does not, along with common use cases in the industry.

At this stage, study the official design documents, architecture diagrams, and schematic diagrams, or relevant technical blogs. Popular open source projects usually have plenty of published analyses, so we can build on others' work and avoid duplicated effort. Note, however, that because authors differ in experience, skill, focus, and the version they used, their conclusions may disagree or even be wrong, so no single analysis should be trusted completely. A good approach is to cross-check: read several analyses and compare where they agree and where they differ.

Meanwhile, if material on some technical point is hard to find and you are unsure about it, write a small example to verify it, and observe the details through log output, debugging, and monitoring tools. For example, write a simple program with Netty and watch its network packets with a packet capture tool to understand the implementation.
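
A verification example of this kind does not have to be large. The sketch below is a single-threaded echo server built on Python's standard `selectors` module, following the Reactor idea mentioned earlier: one loop waits for ready sockets and dispatches to a handler. It is not Memcached, libevent, or Netty code; the `serve` and `demo` functions and the `log` list are assumptions made for illustration.

```python
import selectors
import socket
import threading


def serve(listener, stop, log):
    """Reactor loop: wait for ready sockets, then dispatch by handler tag."""
    sel = selectors.DefaultSelector()
    listener.setblocking(False)
    sel.register(listener, selectors.EVENT_READ, data="accept")
    while not stop.is_set():
        for key, _ in sel.select(timeout=0.1):
            if key.data == "accept":
                conn, addr = key.fileobj.accept()
                conn.setblocking(False)
                sel.register(conn, selectors.EVENT_READ, data="echo")
                log.append(("accept", addr))
            else:
                conn = key.fileobj
                data = conn.recv(4096)
                if data:
                    log.append(("read", len(data)))
                    conn.sendall(data)  # acceptable only for tiny demo payloads
                else:
                    sel.unregister(conn)
                    conn.close()
                    log.append(("close", None))
    sel.close()
    listener.close()


def demo():
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen()
    stop, log = threading.Event(), []
    worker = threading.Thread(target=serve, args=(listener, stop, log))
    worker.start()
    with socket.create_connection(listener.getsockname()) as client:
        client.sendall(b"ping")
        reply = client.recv(4096)
    stop.set()
    worker.join()
    return reply, log
```

Running `demo()` while a packet capture tool watches the loopback interface shows the connection setup, the payload, and the teardown, and the `log` list shows the order in which the loop handled the events.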

4.3 System testing

If you are studying on your own, you can rely on tests and analyses published online, but if you plan to use the project in production you must test it yourself. Published results may not match your business scenario, and results can differ considerably between versions, so relying on them alone can easily lead to wrong conclusions.

Note in particular that testing should come only after you have a systematic understanding of the project. Do not start testing right after installation. Otherwise, wrong configuration or improper usage may leave you with an environment that does not reflect your business characteristics and with poorly designed test cases, so the results point to wrong conclusions and mislead design decisions.

The following are common testing ideas. Design the actual test cases around your specific project and business.

  • Check the functions and impacts of each configuration item and identify key configuration items
  • Perform performance tests for multiple scenarios
  • Run for several days to observe the fluctuation of CPU, memory, disk IO and other indicators
  • Fault test: kill, power off, remove network cable, restart more than 100 times, switch over, etc
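
For the multi-scenario performance tests, a small timing harness is often enough to get started. The sketch below is one possible shape, not a replacement for a proper benchmarking tool: `measure` and `compare` are hypothetical helpers, and each scenario is assumed to be a zero-argument callable that performs one operation against the system under test.

```python
import statistics
import time


def measure(operation, runs=200):
    """Time `operation` repeatedly; return a latency summary in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        samples.append((time.perf_counter() - start) * 1000)
    # n=100 yields 99 cut points: index 49 is the median, index 98 is p99.
    cuts = statistics.quantiles(samples, n=100, method="inclusive")
    return {"p50": cuts[49], "p99": cuts[98], "max": max(samples)}


def compare(scenarios, runs=200):
    """Run every named scenario and collect its latency summary."""
    return {name: measure(op, runs) for name, op in scenarios.items()}
```

A call such as `compare({"small_value": write_small, "large_value": write_large})`, where the two callables are your own, gives one summary per scenario. Looking at p99 and the maximum as well as the median helps expose the fluctuations that only show up under long runs or fault injection.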

4.4 Key source code learning

Study in depth the project's design ideas and implementation details. The target here is "proficient": striving for excellence, with no end to learning. This is the realm of the experts. If you want to carry major technical responsibility on your team or become a significant contributor to the project's community, aim for this level.

Code is not only to be read. You also have to run it and experiment with it. Some people start reading the code before they have even called the API, thinking they are saving time, when they are actually sabotaging themselves.

The key to analyzing and testing the source code is as follows:

  • 1 Get the call stack in the IDE. Read the code in an IDE, where jumping around and viewing definitions is easy and far more efficient than reading on the web. Run the example program under the IDE's debugger and set breakpoints to obtain the call stack at run time. Compile and debug wherever possible. Code that you can step through in a debugger is almost never unreadable.

  • 2 Draw the call stack. After sorting out the call logic, draw it with a diagramming tool. Options include a flow chart, a class diagram, a call graph, and a sequence diagram. Choose whichever is most expressive for the case at hand.
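
The same information a breakpoint gives you can also be collected programmatically, which is handy as raw material for a call graph. The sketch below uses Python's standard `sys.setprofile` hook to record which Python functions are called and how deeply they nest. `trace_calls`, `parse`, and `total` are hypothetical names used for illustration.

```python
import sys


def trace_calls(func, *args, **kwargs):
    """Run `func` and return an indented list of the Python calls it makes."""
    lines = []
    depth = 0

    def profiler(frame, event, arg):
        nonlocal depth
        if event == "call":  # a Python function was entered
            lines.append("  " * depth + frame.f_code.co_name)
            depth += 1
        elif event == "return":  # a Python function is exiting
            depth -= 1
        # "c_call" and "c_return" (built-ins) are deliberately ignored

    previous = sys.getprofile()
    sys.setprofile(profiler)
    try:
        func(*args, **kwargs)
    finally:
        sys.setprofile(previous)  # restore whatever hook was there before
    return lines


def parse(text):
    return text.split(",")


def total(text):
    return sum(map(int, parse(text)))
```

Calling `trace_calls(total, "1,2,3")` records `total` with `parse` nested under it. The indented list can then be turned into a call graph or sequence diagram by hand.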

In addition, learn more about design patterns, so that when you see Proxy, Builder, or Factory in a name, you immediately know what it does. Think in horizontal layers and vertical blocks: code is modular, with parts such as Core, Util, and Parser, so you should always know which layer and which block you are looking at.

Some small projects are not clearly layered, and there is no need to force layers onto them. What we should look at is not only language tricks but also the ideas and principles behind the design. The simplest test of whether you have understood the code is this: given enough time, would you be confident writing something similar yourself?

5 Summary of the steps

In practice, completing all four steps takes a long time. Usually the first two steps are essential whenever you study an open source project, the third step can wait until you actually plan to use the project at work, and the fourth step is for when you have the time and energy.

Rather than skimming every project, it is better to study one thoroughly. Even at one project every six months, the total after a few years is considerable. Moreover, many ideas are shared across projects, such as high-availability schemes and distributed protocols. After studying one project thoroughly, you will find similar projects quick to learn, because you have already mastered the common parts and need only study what is different.

During learning, keep summarizing, reviewing, and writing study notes. This exercises logical thinking and also builds an index of your knowledge, so that when you review after some time you can quickly recover what you learned and are less likely to forget it.

6 Recommended open source projects for beginners

With the theory covered, it is time to test it in practice. Below are several open source projects common in server-side development that are relatively beginner-friendly and well documented:

  • Spring

    As the most popular framework in the industry, Spring's importance is self-evident. Because the Spring ecosystem is large and your energy is limited, beginners are advised to start with the simplest modules, such as Spring JdbcTemplate, Spring IoC, Spring AOP, and Spring MVC.

  • MyBatis

    MyBatis is a popular, well-regarded persistence framework that supports plain SQL queries, stored procedures, and advanced mapping. Its code base is not large, plenty of source code analyses are available online, and the code quality is relatively high, so it is well worth reading.

  • Elastic-Job

    As an open source distributed task scheduling solution from Dangdang, Elastic-Job is very popular in the community. From it you can learn about distributed communication and scheduling.

  • Dubbo

    Dubbo is a high-performance service governance framework developed by Alibaba that lets applications expose and consume services through high-performance RPC. Maintenance of Dubbo resumed at the end of 2017, and it is widely used in production. Reading and understanding its source code should bring a qualitative leap in your command of service governance and distributed protocols.

