An algorithm is a sequence of steps for solving a problem. The same problem can have multiple solutions, so there can be many different algorithms. Some are good and some are bad, and the distinguishing mark is complexity.

Complexity lets us estimate an algorithm's cost, such as time and memory, so we use complexity to evaluate algorithms.

(If you are not familiar with complexity, see this article: "A profiler is not necessary for performance analysis: complexity analysis works too.")

In everyday development, the most straightforward approach works in most scenarios; even a high-complexity algorithm causes no visible problem. It seems there is no need for fancy algorithms, and the whole subject feels like something you learn but never use.

Not so. That is only because you have not hit a data-heavy scenario yet.

Let me give you an example of a specific scenario from my previous company:

An example of the power of an algorithm

This is a real example from my former company, Autonavi.

We do a full-source dependency analysis across tens of thousands of modules. If module A depends on module B, then B is a forward dependency of A, and A is a reverse dependency of B. We analyze the forward dependencies first, then the reverse dependencies.

When analyzing reverse dependencies, the original approach was: for each module, scan all the other modules and collect every module whose forward dependencies include it; those are its reverse dependencies.

It is a very simple, intuitive approach. So what is wrong with it?

The complexity of this algorithm is O(n^2). With n in the tens of thousands, performance will be very poor, and we can predict that from the complexity alone.
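The O(n^2) approach can be sketched as follows. This is an illustrative example, not the actual Autonavi code; the function name and the toy module graph are made up for the sketch.

```python
def reverse_deps_naive(forward):
    """forward: dict mapping each module to the list of modules it depends on.

    For every module, scan every other module's forward dependencies to find
    who depends on it: n modules x n scans = O(n^2).
    """
    reverse = {m: [] for m in forward}
    for target in forward:                 # n modules to look up
        for mod, deps in forward.items():  # scan all n modules again
            if target in deps:
                reverse[target].append(mod)
    return reverse

# Toy graph: "app" depends on "net" and "ui"; "ui" depends on "net".
forward = {"app": ["net", "ui"], "ui": ["net"], "net": []}
print(reverse_deps_naive(forward))  # "net" is depended on by "app" and "ui"
```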

And indeed, running the full-source dependency analysis took us more than 10 hours, sometimes an entire night.

If you had to optimize, how would you optimize performance?

Some readers may say: split the dependency-analysis task into several parts and run them on multiple processes or worker threads; that yields a several-fold speedup.

True, that is a decent improvement.

But would you believe that a single change improved performance tens of thousands of times?

Here’s how we changed it:

When analyzing reverse dependencies, we had been traversing all the forward dependencies for each module. But a reverse dependency is just a forward dependency viewed from the other end, isn't it?

So we simply changed the code to record reverse dependencies while analyzing forward dependencies.

That way there is no separate reverse-dependency pass at all, and the algorithm goes from O(n^2) to O(n).
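The single-pass version can be sketched like this. Again a hypothetical sketch with made-up names: while walking each module's forward dependencies, we record the reverse edge immediately, so one pass over the graph produces both directions.

```python
def reverse_deps_single_pass(forward):
    """Build reverse dependencies while walking forward ones.

    Each dependency edge is visited exactly once, so the work is
    proportional to the number of modules and edges: O(n), not O(n^2).
    """
    reverse = {m: [] for m in forward}
    for mod, deps in forward.items():
        for dep in deps:
            reverse[dep].append(mod)  # record the reverse edge on the spot
    return reverse

forward = {"app": ["net", "ui"], "ui": ["net"], "net": []}
print(reverse_deps_single_pass(forward))  # same answer, one pass
```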

A change from O(n^2) to O(n) with tens of thousands of modules is equivalent to tens of thousands of times of performance improvement.
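The arithmetic behind that claim, ignoring constant factors: with n = 10,000 modules, O(n^2) means on the order of 100 million steps while O(n) means about 10,000, a 10,000x reduction.

```python
n = 10_000                    # "tens of thousands of modules"
naive_steps = n * n           # O(n^2): ~100,000,000 scans
single_pass_steps = n         # O(n): one pass over the modules
print(naive_steps // single_pass_steps)  # 10000x fewer steps
```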

In wall-clock time, code that used to run all night now finishes in about ten minutes. Could multithreading or multiprocessing alone achieve that?

That is the power of algorithms: finding a lower-complexity algorithm means a huge improvement in performance.

A detailed framework for dependency analysis can be found in this article:

Autonavi APP full link source code dependency analysis project

The reason virtual-DOM diff gets talked about so much is the same: it reduces O(n^2) to O(n), which with thousands of DOM nodes means a thousands-fold performance difference.
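The trick behind a keyed diff is the same pattern as the dependency example: replace repeated scans with a single lookup structure. The sketch below is an illustration of the idea, not React's actual reconciliation algorithm; node shape and patch names are invented, and removals are omitted for brevity.

```python
def diff_keyed(old, new):
    """Diff two keyed node lists in O(n).

    Instead of comparing every old node against every new node (O(n^2)),
    build a key -> node index once, then resolve each new node with one lookup.
    """
    old_index = {node["key"]: node for node in old}  # O(n) to build
    patches = []
    for node in new:                                 # O(n) lookups
        prev = old_index.get(node["key"])
        if prev is None:
            patches.append(("insert", node["key"]))
        elif prev["value"] != node["value"]:
            patches.append(("update", node["key"]))
    return patches

old = [{"key": "a", "value": 1}, {"key": "b", "value": 2}]
new = [{"key": "a", "value": 1}, {"key": "b", "value": 3}, {"key": "c", "value": 4}]
print(diff_keyed(old, new))  # b changed, c is new
```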

So, do you feel the power of algorithms now?

Conclusion

Multithreading, caching, and similar techniques improve performance by a few times, while optimizing the algorithm improves performance by orders of magnitude: with large amounts of data, thousands or tens of thousands of times.

So why do we feel algorithms are useless? Because the amount of data you deal with is so small. With a few hundred items, O(n^2), O(n^3), and O(n) make little practical difference.

The larger the data in your scenarios, the more the algorithm matters, because the gap between a good algorithm and a bad one is not a matter of a few times but potentially tens of thousands of times.

So is it pointless that all these companies test algorithms in interviews? No. Algorithms matter a great deal: they directly determine the performance of the code you write.