Preface:

Mainstream models in most directions have already reached high accuracy. Almost any method you can think of has been tried by someone before you, and often done very well, so papers are getting harder to publish and new ideas harder to find.

So how do you find your own innovation, and how do you improve on earlier work? This article offers several ideas.

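I have given the common ideas a few nicknames, shown in parentheses after each item: “much ado about nothing”, “the waves behind push on the waves ahead”, “bring forth the new through the old”, and “surprise”.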

1. Add noise or perturbations to the original dataset, such as random occlusion or changes to saturation and brightness. Choose perturbations that make sense for the specific task rather than adding them arbitrarily. If accuracy drops sharply, your research question becomes how to keep the model accurate under occlusion, noise, or similar conditions. (Much ado about nothing)

For example, object detection has special cases such as a dog poking its head out from behind a wall, or overexposure under artificial light at night.

2. Try the existing model on a dataset from a new scenario, since the original model may well be overfitted to its original data. If accuracy drops severely in the new scenario, the idea is to improve the model’s generalization so that it stays accurate there. (Much ado about nothing)

Each field of deep learning has been developing for several years. Early on, datasets were scarce. In recent years new and larger datasets have appeared, and earlier models may not work well on them.

The key here is to analyze the characteristics of the dataset the original model was built on, and to derive improvement ideas from them.

3. Look for the model’s existing problems: it is too large, inference is too slow, training takes too long, convergence is slow, and so on. Usually where one of these problems exists, others come with it. If so, you can work on speeding up inference, greatly reducing the parameter count or the amount of computation, or speeding up convergence, while losing as little accuracy as possible. (The waves behind push on the waves ahead)

4. Consider whether the model is too complex: too much hand-crafted design, too much post-processing, too much tuning. If so, consider designing an end-to-end model. During the design you will almost certainly find that it trains poorly at first. The new methods you devise to fix that are your innovation. (The waves behind push on the waves ahead)

5. Swap in new structures and borrow techniques from other directions, such as the Transformer or feature pyramids. This mainly requires keeping up with related and cutting-edge techniques, so it is worth following work in all directions. (Bring forth the new through the old)

6. Try detection or recognition for one specific target. To generalize, a general-purpose model usually detects or recognizes many classes, which can lower the accuracy on each one. So consider handling only a particular class. Take action recognition: some generic models recognize dozens of actions, but you could specialize in fall detection. You can then build a lot of prior knowledge into the model, for example through multi-task learning. Because the model is designed specifically for falls, it tends to be more accurate. (Surprise)

Note: The specific target you choose should have real application prospects, so that readers feel it meets a practical need.

Some people on Zhihu say that opening up a new direction is a higher achievement than innovating on or improving models. I disagree, though I do not deny that opening up a new field is a contribution. I would say the two are equal, because both solve real problems. This is only an aside expressing a personal view; the main point follows.

For readers without an advisor, and in general, I do not recommend starting an entirely new direction. Reviewers at top venues may well not accept that the direction is valuable or meaningful, and months of work are wasted if they do not. What I do suggest is working in a direction that was opened up only in the last year or two. Few people are working in it yet, so results are more likely. (Of course, once this article is published, whether everyone starts looking for directions this way is out of my hands.)

So how do you find directions that are only one or two years old? Search the top conferences of the last one or two years. Acceptance at a top conference means experts have already endorsed the direction as meaningful and worth pursuing.

The above are targeted ideas. The orthodox approach is to read the important papers in your direction and then write a survey. While writing you will notice problems. The goal is not necessarily to beat the SOTA model on accuracy, but to solve open problems in the direction. Solving an existing problem is the core value of a paper; without it the paper is just a dozen pages of waste paper.

For example, the lightweight design, faster inference, real-time detection, and end-to-end design mentioned above all solve existing problems in a direction. Improving accuracy further also counts. Other problems exist too, and they can only be identified by analyzing the specific task.
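As one illustration of reducing the parameter count, the sketch below compares a standard convolution layer with a depthwise separable one, a technique commonly used in lightweight CNNs. This article does not prescribe that technique; it is picked here only as an example, and the function names and channel sizes are illustrative assumptions.

```python
def conv_params(c_in, c_out, k, bias=False):
    """Parameter count of a standard k x k convolution layer."""
    params = k * k * c_in * c_out
    return params + (c_out if bias else 0)


def depthwise_separable_params(c_in, c_out, k, bias=False):
    """Parameter count of a depthwise k x k conv followed by a 1 x 1 conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    extra = (c_in + c_out) if bias else 0
    return depthwise + pointwise + extra


# Illustrative layer: 128 input channels, 256 output channels, 3 x 3 kernel.
standard = conv_params(128, 256, 3)
separable = depthwise_separable_params(128, 256, 3)
print(standard, separable, round(standard / separable, 2))
# prints: 294912 33920 8.69
```

Fewer parameters do not guarantee equal accuracy, so a comparison like this only shows the size side of the trade-off; the accuracy side still has to be measured on the task.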

If you still have no ideas after writing a survey, try the ideas above, and also find classic papers related to your direction, read them closely, and think them through. Reading more and thinking more is what matters most. Of course, for this to be effective, one of the most important prerequisites is having enough background knowledge; without it, no amount of thinking helps.

In many cases, adding modules from other directions to a model, such as plug-and-play modules or attention mechanisms, improves it to some extent, and this counts as innovation. But you need a reasonable explanation of why it works and what problem it solves. Sometimes a small change to the original model yields a large improvement, which is also innovation, provided the gain is stable across multiple datasets and not a one-off.
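A minimal sketch of the “stable across multiple datasets” check, assuming you have scores for a baseline and a modified model on several datasets and random seeds. The function name `is_stable_gain`, the dataset names, and the scores are hypothetical placeholders, not results from any real experiment.

```python
from statistics import mean


def is_stable_gain(baseline, improved, min_gain=0.0):
    """Check that the mean score improves on every dataset.

    baseline, improved: dicts mapping dataset name -> list of scores,
    one score per random seed (higher is better).
    Returns (stable, gains), where gains maps dataset name -> mean gain.
    """
    gains = {
        name: mean(improved[name]) - mean(baseline[name])
        for name in baseline
    }
    stable = all(gain > min_gain for gain in gains.values())
    return stable, gains


# Placeholder scores for illustration only (not real results).
baseline = {"dataset_a": [70.0, 71.0, 69.0], "dataset_b": [55.0, 56.0, 54.0]}
improved = {"dataset_a": [72.0, 73.0, 71.0], "dataset_b": [54.0, 55.0, 53.0]}

stable, gains = is_stable_gain(baseline, improved)
print(stable, gains)
```

In this placeholder data the change helps on one dataset and hurts on the other, so the check reports the gain as not stable. A stricter version might also compare each gain with the seed-to-seed variation.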

But in such cases the argument and evidence can be finished in a page or two, which is too little text and too little work for a paper. How do you turn the innovation into a complete paper, and how do you explain why it works? I recommend reading “CNN Visualization Technology Summary (Part 1)”.

For adding noise or random occlusion, see “Summary of data augmentation methods” in the technical summary series of the public account “CV Technical Guide”.
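The perturbations from idea 1 could be sketched as follows, assuming images are `uint8` NumPy arrays of shape H x W x C and `predict` is any callable that maps an image to a label. All function names here are illustrative and not taken from a particular library.

```python
import numpy as np


def random_occlusion(image, rng, max_fraction=0.3, fill=0):
    """Return a copy of `image` with one random rectangle set to `fill`."""
    h, w = image.shape[:2]
    # Rectangle sides are at least 1 pixel and at most max_fraction of each side.
    occ_h = int(rng.integers(1, max(2, int(h * max_fraction) + 1)))
    occ_w = int(rng.integers(1, max(2, int(w * max_fraction) + 1)))
    top = int(rng.integers(0, h - occ_h + 1))
    left = int(rng.integers(0, w - occ_w + 1))
    out = image.copy()
    out[top:top + occ_h, left:left + occ_w] = fill
    return out


def adjust_brightness(image, factor):
    """Scale uint8 pixel intensities by `factor`, clipping to [0, 255]."""
    scaled = image.astype(np.float32) * factor
    return np.clip(scaled, 0, 255).astype(np.uint8)


def accuracy(predict, images, labels):
    """Fraction of images for which `predict` returns the correct label."""
    correct = sum(int(predict(img) == lab) for img, lab in zip(images, labels))
    return correct / len(labels)


def robustness_gap(predict, images, labels, perturb):
    """Accuracy on clean images minus accuracy on perturbed images."""
    clean = accuracy(predict, images, labels)
    perturbed = accuracy(predict, [perturb(img) for img in images], labels)
    return clean - perturbed
```

A large gap for a task-relevant perturbation, for example `lambda img: random_occlusion(img, np.random.default_rng(0))`, suggests the kind of weakness that idea 1 turns into a research question.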

For making models lightweight, speeding up inference, or reducing the parameter count, see “Summary of CNN Structural Evolution (II): Lightweight Models” and “Summary of CNN Structural Evolution (III): Design Principles” in the technical summary series.

To broaden your knowledge, I suggest following the paper-sharing series of the public account “CV Technical Guide”. Its articles are strictly screened before being shared, so readers can generally learn something new or be prompted to think about some direction. The series also covers classic papers from the past, not only the latest top-venue papers.

Other articles

Summary of attention mechanisms

Summary of feature pyramids

Summary of data augmentation methods

Summary of CNN visualization technology

Summary of CNN structural evolution (I): classical models

Summary of CNN structural evolution (II): lightweight models

Summary of CNN structural evolution (III): design principles

Summary of pooling techniques

Summary of non-maximum suppression

Summary of English literature reading methods

Summary of common ideas of paper innovation

Summary of normalization methods, aka “BN and the waves that followed”

This article comes from the technical summary series of the public account “CV Technical Guide”.

A PDF collecting all of the summary articles above is available by replying “Technical Summary” to the CV Technical Guide account.