Recently, an email entitled “Please don’t waste maintainers’ time on your KPI grabbing patches (AKA, don’t be a KPI jerk)” appeared on the Linux kernel mailing list. In the email, Linux kernel maintainer Qu Wenruo pointed out that patches submitted by Huawei developers appeared to be aimed at padding KPIs, which sparked heated discussion in the community. For details, see “Linux kernel maintainer criticizes Huawei developers for KPIs”.

Leizhen, a Huawei developer who contributes to the Linux kernel, responded on the mailing list. He said in the email that his main past contribution to the kernel was optimizing the performance of the ARM64 SMMU drivers, including IOVA optimization, strict mode optimization, and lazy mode optimization. He also took part in developing ARM SoC drivers. When time and energy permit, he contributes to other modules of the Linux kernel as well, looking for areas to improve, and in the process does some “cleanup” work. Finally, Leizhen said he will continue to make increasingly important contributions to the Linux community.

Qu Wenruo, the author of the original email, responded to Leizhen quickly. He acknowledged Leizhen’s important past contributions to the Linux kernel and said he is happy with the “cleanup” effort, which he does not consider trivial. His request is only that such small fixes be merged into a larger patch before submission, because maintainers have a lot of work to do and should not have to spend their time on minor issues. Finally, Qu listed some areas of interest to the Linux kernel community where contributions are still needed.

According to the development statistics for Linux Kernel 5.10, released in December 2020, Huawei ranked first in the number of patches submitted and second in the number of lines of code modified, behind only Intel.

So is Huawei’s contribution to the Linux kernel KPI padding, or is it genuine?

How to objectively evaluate Huawei’s contribution to the Linux Kernel?

Measured by the number of submissions, Huawei’s contribution is second only to that of individual developers (grouped under gmail.com and kernel.org), and ranks first among global tech companies.

To set aside the “KPI padding” concern raised in those emails, there are two other ways to rank contributions: by development equivalent (a measure of code logic complexity) or by impact. Both lead to different results.

If we rank by development equivalent (ELOC), which uses program analysis to measure workload at the code level and filters out noise such as blank lines and dead code, Huawei’s ranking drops to around 10th, as shown in the following table. The top three tech companies are Intel, AMD and Nvidia.

If we also take into account the dependencies within the code, the ranking by combined impact is as follows:

As we can see, Huawei is still around the top 10, similar to Google, Microsoft and others.

Huawei’s ranking changes as we move from a shallow indicator such as the number of submissions to deeper indicators such as ELOC or IMPACT. This shows that program analysis can provide fresh perspectives and information for measuring contribution. A developer’s contribution to a project can be evaluated not only by the number of submissions, but also by ELOC and IMPACT. These better reflect the developer’s contribution to the project and reduce the effect of different development habits on measurement accuracy.

Of course, even by these more scientific measures, Huawei still ranks among the top 10 technology companies worldwide in kernel contributions and is the largest contributor among Chinese companies. From this perspective, although we encourage core contributions, we cannot assume that Huawei’s “first place” is merely the result of KPI padding.

Finally, to beginners and to friends who are just getting involved in open source, we would say: do not hold back a contribution just because it is small. Beginners need not worry too much about whether a contribution is core enough. Getting started matters more than anything, and even fixing a few typos deserves a thumbs-up.

Explanation of Related Terms

About ELOC (Development Equivalent)

Development equivalent is a reasonable quantification and measurement of a programmer’s code output. Compared with shallow statistical indicators such as lines of code and number of submissions, development equivalent has two advantages: first, it is not easily distorted by programming habits or specific code behaviors (such as line breaks, blank lines, comments, parentheses, etc.); second, it better reflects the amount of logic involved in code development. Specifically, the development equivalent measures the complexity of the abstract syntax tree.

We can calculate both absolute and cumulative development equivalents. Software development is a dynamic process: the code changes with each commit, and the corresponding abstract syntax tree evolves with it. The absolute value of development equivalent can be understood as a calculation over the abstract syntax tree of the code at a given commit snapshot, taking into account the height of the tree, the number of nodes, the weights of different node types, and so on. The absolute value fluctuates over the course of development, usually in a repeating pattern of steady rises followed by small dips.

The cumulative value of development equivalent is a calculation of the code changes made by each commit. It is based on the minimum edit distance between the abstract syntax trees before and after the commit. Code deletions also count as contributions, but are given significantly less weight than code additions. The cumulative value is a monotonically increasing variable that reflects the output and progress of a team or project.
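To make the idea concrete, here is a minimal Python sketch of an AST-based measure. It is not the actual ELOC algorithm, which is not specified in this article: the node weights, the `toy_equivalent` and `toy_change_score` functions, and the deletion weight are all hypothetical. In particular, the change score compares counts of node types as a crude stand-in for a true minimum tree edit distance.

```python
import ast
from collections import Counter

# Hypothetical weights: control-flow and definition nodes count more
# than ordinary nodes, which default to a weight of 1.
NODE_WEIGHTS = {ast.If: 3, ast.For: 3, ast.While: 3, ast.FunctionDef: 2, ast.Call: 2}


def tree_height(node):
    """Height of the syntax tree rooted at node (a leaf has height 1)."""
    children = list(ast.iter_child_nodes(node))
    return 1 + max((tree_height(child) for child in children), default=0)


def toy_equivalent(source):
    """Toy 'absolute value': weighted node count plus tree height."""
    tree = ast.parse(source)
    weighted_nodes = sum(NODE_WEIGHTS.get(type(n), 1) for n in ast.walk(tree))
    return weighted_nodes + tree_height(tree)


def node_counts(source):
    """Count how many nodes of each type appear in the syntax tree."""
    return Counter(type(n).__name__ for n in ast.walk(ast.parse(source)))


def toy_change_score(before, after, deletion_weight=0.2):
    """Toy 'cumulative' increment for one commit.

    Assumption: added nodes count fully, removed nodes count at a
    reduced weight. A real tool would use a tree edit distance instead.
    """
    old, new = node_counts(before), node_counts(after)
    added = sum((new - old).values())
    removed = sum((old - new).values())
    return added + deletion_weight * removed


# Same logic, very different layout and line count.
compact = "def f(xs):\n    return [x * 2 for x in xs if x > 0]\n"
spread = """
def f(xs):
    # double the positive values

    return [
        x * 2
        for x in xs
        if x > 0
    ]
"""

same_score = toy_equivalent(compact) == toy_equivalent(spread)
line_counts = (len(compact.splitlines()), len(spread.splitlines()))
```

Because comments, blank lines, and line breaks never reach the syntax tree, the two snippets get the same score even though their line counts differ, and reformatting alone yields a change score of zero.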

About IMPACT (Development Value)

Development value is a composite index that combines development equivalent with the call relationships in the code. For ease of understanding, we express the index as a percentage, which can be read intuitively as a contribution ratio. The call relationships reflect the interdependencies within the code, including function calls, class inheritance, interface calls and so on. Code is not a linear sequence but a graph built on dependencies. The more code that depends, directly or indirectly, on a given piece of code, the greater that code’s impact. It also means that if the code is changed, the scope of regression testing will be larger. In general, changes to this kind of code are more costly and more important.
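The following Python sketch illustrates the idea on a toy call graph. The article does not give the actual IMPACT formula, so the scoring rule used here (development equivalent multiplied by one plus the number of transitive dependents, then normalized to a percentage) is an assumption for illustration only, as are the function names, the graph, and the per-function equivalents.

```python
from collections import defaultdict, deque

# Hypothetical call graph: each caller maps to the functions it calls.
CALLS = {
    "main": ["parse", "render"],
    "parse": ["tokenize"],
    "render": ["tokenize", "escape"],
    "tokenize": [],
    "escape": [],
}

# Hypothetical development equivalent of each function.
ELOC = {"main": 10, "parse": 30, "render": 30, "tokenize": 20, "escape": 10}


def transitive_dependents(calls):
    """For each function, find every function that depends on it,
    directly or indirectly, by walking the reversed call graph."""
    callers = defaultdict(set)
    for caller, callees in calls.items():
        for callee in callees:
            callers[callee].add(caller)

    result = {}
    for fn in calls:
        seen, queue = set(), deque([fn])
        while queue:
            for caller in callers[queue.popleft()]:
                if caller not in seen:
                    seen.add(caller)
                    queue.append(caller)
        result[fn] = seen
    return result


def toy_impact_share(calls, eloc):
    """Assumed scoring rule: weight each function's equivalent by how
    many functions rely on it, then express the result as a percentage."""
    dependents = transitive_dependents(calls)
    raw = {fn: eloc[fn] * (1 + len(dependents[fn])) for fn in calls}
    total = sum(raw.values())
    return {fn: 100 * score / total for fn, score in raw.items()}


dependents = transitive_dependents(CALLS)
shares = toy_impact_share(CALLS, ELOC)
```

In this toy graph, `tokenize` receives the largest share because three other functions depend on it, and `escape` outranks `main` despite having the same development equivalent, since nothing depends on `main`.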


Related reading:

  • The complete list can be viewed at https://ranking.merico.build/oss-orgs/; the data and analysis tooling behind it come from the project at https://github.com/merico-dev/build
  • Linux kernel maintainers accuse Huawei developers of using KPIs
  • Zhihu: Huawei Linux kernel contributors are questioned about the KPI, what is the real situation? What information is worth paying attention to?