After Coding introduced the “code force” feature, I was often asked: “Does code force really measure a programmer’s ability?” “How is code force evaluated?” “I’m a beginner in programming. If I upload other people’s code, will my code force be higher?”

As the designer of the code force value product, I want to answer these questions today.

How code force is evaluated

Code force value is a scoring system, built on big data with machine learning and other technologies, that aims to evaluate a programmer’s programming ability objectively.

The evaluation draws on the code a programmer has written over a past period, together with behavioral characteristics observed while writing that code. From this large volume of data we built a set of evaluation models using machine learning algorithms. The models cover the vast majority of dimensions related to a programmer’s coding ability. There are dozens of them, such as personal experience, code quality, coding productivity, and technical diversity. The models score each piece of code in each dimension, and these scores gradually accumulate into the code force value.

Among these evaluation models, several are especially important: the code quality model, the behavior characteristic model, and the code force value model. Their mechanisms are briefly described below.

Code quality model

The raw data related to code quality contains hundreds of indicators, and using all of them to calculate the code force value would obviously be unreasonable. The goal of the code quality model is to use dimension reduction to select the indicators that are positively correlated with code quality, and then feed the reduced data into the code force value model.

In the model, a random forest algorithm is first used to classify the indicators and select those that are positively correlated with code quality. Dimension reduction is then performed with principal component analysis and other techniques. The final output is a set of data closely related to code quality.
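The two stages described above (screen the indicators, then reduce dimensions) can be sketched in plain Python. This is only an illustration under loud assumptions, not the actual model. The screening step uses a simple Pearson correlation threshold as a stand-in for the random forest. The reduction step computes only the first principal component, by power iteration. The indicator names (`test_ratio`, `doc_ratio`, `file_count`), the quality labels, and the threshold are all hypothetical.

```python
import math
from statistics import fmean


def pearson(xs, ys):
    """Pearson correlation of two equal-length series (neither may be constant)."""
    mx, my = fmean(xs), fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def screen_indicators(columns, quality, threshold=0.5):
    """Keep indicators whose correlation with the quality label exceeds threshold.

    Stand-in for the random forest screening step; the threshold is hypothetical.
    """
    return {name: values for name, values in columns.items()
            if pearson(values, quality) > threshold}


def first_principal_component(columns, iterations=200):
    """Project each sample onto the dominant eigenvector of the covariance matrix."""
    centered = []
    for name in columns:
        mean = fmean(columns[name])
        centered.append([value - mean for value in columns[name]])
    rows = list(zip(*centered))  # one tuple of centered indicators per sample
    k = len(centered)
    cov = [[sum(r[i] * r[j] for r in rows) / (len(rows) - 1) for j in range(k)]
           for i in range(k)]
    vec = [1.0] * k
    for _ in range(iterations):  # power iteration toward the dominant eigenvector
        nxt = [sum(cov[i][j] * vec[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    return [sum(r[i] * vec[i] for i in range(k)) for r in rows]


# Hypothetical indicators for five code samples, with made-up quality labels.
columns = {
    "test_ratio": [0.1, 0.4, 0.5, 0.8, 0.9],
    "doc_ratio": [0.2, 0.3, 0.6, 0.7, 0.9],
    "file_count": [5, 3, 4, 3, 5],
}
quality = [1, 2, 3, 4, 5]

kept = screen_indicators(columns, quality)
component = first_principal_component(kept)
```

Here `file_count` is uncorrelated with the labels and is dropped, while the two remaining indicators are compressed into a single value per sample.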

Behavior characteristic model

The goal of the behavior characteristic model is to extract a series of feature indicators by analyzing how users submit code, and from them identify the user’s working state, work intensity, delivery ability, and other data related to personal experience and ability.

The model is mainly based on time series algorithms. Feature indicators such as kurtosis, margin, and variance are extracted from the time series, and these indicators are then fed into the code force value model.
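As a rough illustration of this kind of feature extraction, the sketch below computes three indicators from a numeric series such as daily commit counts. It rests on assumptions: the article does not define its features precisely, so "margin" is interpreted here as the margin factor used in signal analysis (peak amplitude divided by the squared mean of square-rooted amplitudes), kurtosis is the plain non-excess population kurtosis, and the sample data is invented.

```python
import math
from statistics import fmean


def time_series_features(series):
    """Return variance, kurtosis, and margin factor for a numeric series."""
    if len(series) < 2:
        raise ValueError("need at least two observations")
    mean = fmean(series)
    # Population variance: mean squared deviation from the mean.
    variance = fmean((x - mean) ** 2 for x in series)
    if variance == 0:
        raise ValueError("series is constant; kurtosis is undefined")
    # Kurtosis: fourth central moment divided by the squared variance.
    fourth_moment = fmean((x - mean) ** 4 for x in series)
    kurtosis = fourth_moment / variance ** 2
    # Margin factor (assumed meaning of "margin"): peak / (mean of sqrt|x|)^2.
    peak = max(abs(x) for x in series)
    root_amplitude = fmean(math.sqrt(abs(x)) for x in series) ** 2
    margin = peak / root_amplitude
    return {"variance": variance, "kurtosis": kurtosis, "margin": margin}


# Hypothetical daily commit counts for one user.
features = time_series_features([2, 4, 4, 4, 5, 5, 7, 9])
```

A series with occasional large bursts yields higher kurtosis and margin values than a steady one, which is why such features can hint at working rhythm.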

Code force value model

The code force value model mainly uses a neural network built with TensorFlow and a nonlinear regression algorithm.

If the code quality indicators are scored on a scale starting at 0, there must be some kind of exponential relationship between each indicator and its score. The code force value model therefore uses a nonlinear regression algorithm to find that exponential relationship. It then takes the indicators coming from the upstream models, combines them in a weighted sum using the weight coefficients obtained from the trained neural network, and calculates the total code force value.
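To make the idea concrete, here is a minimal sketch, assuming the relationship has the form score = a * exp(b * indicator). It fits `a` and `b` by ordinary least squares on the logarithm of the score, which is one simple way to do this kind of nonlinear regression and not necessarily the method used in the real model. The function names, the synthetic data, and the weights are hypothetical. In the article the weights come from a trained neural network, whereas here they are hard-coded.

```python
import math


def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) by least squares on ln(y); every y must be > 0."""
    n = len(xs)
    logs = [math.log(y) for y in ys]
    mean_x = sum(xs) / n
    mean_log = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (lg - mean_log) for x, lg in zip(xs, logs))
    b = sxy / sxx
    a = math.exp(mean_log - b * mean_x)
    return a, b


def total_score(dimension_scores, weights):
    """Weighted sum of per-dimension scores."""
    return sum(weights[name] * score for name, score in dimension_scores.items())


# Synthetic data generated from a known curve, so the fit can be checked.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [100.0 * math.exp(-0.5 * x) for x in xs]
a, b = fit_exponential(xs, ys)

# Hypothetical weights; the real ones would come from the trained network.
total = total_score({"quality": 80.0, "experience": 60.0},
                    {"quality": 0.7, "experience": 0.3})
```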

The above is only a brief technical introduction to the evaluation models. The actual research and development process was much more complex, and we took some detours along the way. Next, let’s look at some specific measurement dimensions in the code force model. Along the way you can also learn how to write good code and how to improve your code force.

How to improve the code force value

The logical foundation of code force measurement is that a good programmer consistently writes good-quality code. Therefore, as long as you keep producing good code, your code force will keep rising.

Of the measurement dimensions, those related to code quality have the highest weight, followed by those related to personal experience. Therefore, to achieve a high code force value, you must do two things: 1. Write high-quality code. 2. Build up coding experience through daily accumulation.

To improve the quality of your code, you should first look at the extensibility of your code.

Extensibility

Extensibility means that code is written so that it is easy to modify or to add new features later. In extensible code, parts can be modified, removed, or added with minimal impact on the rest of the code. With code that is not extensible, changing or adding functionality is painful: you only want to add a few lines, but you end up changing much more.

A good product is iterated through constant modification and refactoring. If the code is not extensible, each iteration becomes difficult. In fact, improving extensibility is not that hard: if you take one extra step while writing code and consider how it might later be changed or extended, the extensibility of your code will gradually improve.
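One common way to leave room for future additions is a small registry, sketched below. The names (`EXPORTERS`, `exporter`, `export`) and the export scenario are hypothetical and serve only as an illustration. The point is that supporting a new format means adding one decorated function, without touching existing code.

```python
import json

# Maps a format name to the function that handles it.
EXPORTERS = {}


def exporter(fmt):
    """Decorator that registers a function as the exporter for one format."""
    def register(func):
        EXPORTERS[fmt] = func
        return func
    return register


@exporter("json")
def export_json(record):
    return json.dumps(record, sort_keys=True)


@exporter("csv")
def export_csv(record):
    # Values in key order, comma-separated (no quoting; sketch only).
    return ",".join(str(record[key]) for key in sorted(record))


def export(record, fmt):
    """Dispatch to the registered exporter; callers never change when formats are added."""
    if fmt not in EXPORTERS:
        raise ValueError(f"unsupported format: {fmt}")
    return EXPORTERS[fmt](record)
```

Compare this with a long chain of `if fmt == ...` branches inside `export`, where every new format forces an edit to a function that already works.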

Complexity

The cyclomatic complexity of a program is the number of linearly independent paths through it, and it serves as a measure of the code’s complexity. If a program has no control flow such as an if statement or a for loop, there is only one path through it, so its cyclomatic complexity is 1. If the program contains a single if statement, there are two paths, one where the condition is true and one where it is false, so the cyclomatic complexity is 2.

There is a high positive correlation between code complexity and the number of defects. The higher the complexity, the more defects there tend to be, and the more complex the project, the less extensible the code becomes. One way to reduce complexity is modularity, which keeps code modules loosely coupled while allowing them to be reused by other systems.
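The counting rule in this paragraph can be approximated for Python source with the standard `ast` module: start at 1 and add one for each decision point. The sketch below is a simplified approximation, not the tool used by the code force model. The set of node types counted is an assumption, and constructs such as comprehensions and `match` statements are ignored.

```python
import ast

# Node types treated as one extra decision point each (a simplifying assumption).
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)


def cyclomatic_complexity(source):
    """Approximate cyclomatic complexity: 1 plus the number of decision points."""
    tree = ast.parse(source)
    complexity = 1  # a program with no branches has exactly one path
    for node in ast.walk(tree):
        if isinstance(node, BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds two extra short-circuit branches
            complexity += len(node.values) - 1
    return complexity


straight_line = "x = 1\n"
one_branch = "if x:\n    y = 1\n"
```

With this function, `straight_line` scores 1 and `one_branch` scores 2, matching the two cases in the text.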

Readability

Good readability not only improves the extensibility of a program but also makes handovers and communication across the whole team more efficient. Writing readable code should be a basic skill of any qualified programmer. Even if hubris is one of the three great virtues of a programmer, it should not show up as code that others cannot understand.

There are many ways to improve the readability of code:

  1. Write clear code comments.
  2. Use descriptive names: variable and function names should carry enough information to explain their purpose.
  3. Keep the code logic clear and easy to understand.
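As a small illustration of the points in this list, compare a cryptic one-liner such as `def f(a, b, c): return b - a > c * 86400` with the version below. The scenario and all names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def is_session_expired(last_active_ts, now_ts, max_idle_days):
    """Return True when the session has been idle longer than max_idle_days.

    Both timestamps are Unix times in seconds.
    """
    idle_seconds = now_ts - last_active_ts
    return idle_seconds > max_idle_days * SECONDS_PER_DAY
```

The logic is identical, but the names, the named constant, and the short docstring let a reader understand the function without tracing its callers.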

Bad smell – Duplicate code

Duplicate code is the number one bad code smell. Programmers who like to copy and paste code should be aware that when you paste the same code into several places, you plant a ticking time bomb in the product. When you need to change that code in a later product iteration, you may miss some of the copies and introduce bugs.

To reduce duplicate code, abstract the repeated logic into a function or class.
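A minimal sketch of that refactoring, using a hypothetical username cleanup rule: instead of repeating `raw.strip().lower()` in every handler, the rule lives in one function that the handlers share.

```python
def normalize_username(raw):
    """Single shared implementation of the cleanup rule."""
    return raw.strip().lower()


def register_user(raw_name):
    # Previously this handler carried its own copy of the cleanup logic.
    return {"action": "register", "user": normalize_username(raw_name)}


def login_user(raw_name):
    # Same rule, same function: a future change happens in exactly one place.
    return {"action": "login", "user": normalize_username(raw_name)}
```

If the rule later changes, for example to reject empty names, only `normalize_username` needs editing, so no copy can be missed.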

Robustness

Robustness refers to the code’s ability to tolerate faults and respond appropriately when exceptions occur. Fortunately, most programmers pay close attention to the robustness of their programs, and as they gain experience their code becomes more and more robust.
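For instance, a function that reads a port number from user-supplied configuration can tolerate bad input instead of crashing. The sketch below is illustrative only. The function name and the default value are hypothetical, and silently falling back is just one possible policy (logging or raising a clearer error are others).

```python
def parse_port(value, default=8080):
    """Return a valid TCP port, falling back to default on bad input."""
    try:
        port = int(value)
    except (TypeError, ValueError):
        # None, empty strings, and non-numeric text all end up here.
        return default
    if not 1 <= port <= 65535:
        # Numeric but outside the valid port range.
        return default
    return port
```

Calling `parse_port("443")` returns 443, while `parse_port("abc")`, `parse_port(None)`, and `parse_port("70000")` all return the default.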

Some other dimensions

Here are the answers to the following questions:

“If I upload other people’s code, will I get a high score?” “The code I submit contains a lot of framework code. Will the evaluation model exclude it?”

The model also includes many time-related evaluation dimensions. Over a long time span, the behavioral characteristics of a person’s code writing become visible. The code force value first uses this behavioral data to identify the code that you wrote yourself, and only then evaluates code quality. Code written by others and framework code are identified and excluded by the evaluation model.

Code force is still improving

According to user feedback, the accuracy of code force value has reached nearly 90%, but there is still room for optimization. Through analysis, we found the reasons why the code force value deviates from the real level of some users. These are also the directions for our next improvements:

  1. Add more measurement dimensions. At present, static code quality dimensions make up a large proportion of the evaluation, and “runtime” code quality indicators are not included. As a result, business logic errors in your code are not captured by the evaluation, which causes the code force value to deviate to some extent. As a next step, we will add some “runtime” data dimensions, such as test case failure rate and code coverage, to make the code force value more comprehensive.

  2. Some users do not use Coding as their main code hosting product. Currently, the evaluation is based mainly on the code that users upload to Coding. If the amount of code is small, it cannot support a comprehensive judgment of a person’s real skills. If you use Coding as the code repository for your daily work, the code force value will reflect your coding ability comprehensively. Uploading a lot of code at once does not help your code force grow; only continued use of Coding over time does. We also plan to take technical steps to bring in data from third-party code hosting services such as GitHub and GitLab, so that the data coverage of code force measurement becomes more comprehensive.

  3. Support more programming languages. Currently only four programming languages are supported: Java, PHP, Python, and JavaScript. In the near future, code force will support more languages, including Objective-C, C++, C#, Go, Ruby, and more.

In the future

As richer data accumulates, the code force value will gradually become a professional yardstick for the programming industry, much like Sesame Credit, measuring the overall professional quality of each programmer. China’s software industry will then produce more and more high-quality products. Code force grows through bit-by-bit accumulation. Start developing good programming habits today, improve your code force value, and become a coding master!

If you run into any problems in practice, you can consult the Coding help documentation.