Preface

Due to limited memory and computing resources, it is difficult to deploy convolutional neural networks (CNNs) on embedded devices. Redundancy in feature maps is an important characteristic of successful CNNs, but it has rarely been studied in neural architecture design.

This paper presents a novel Ghost module that can generate more feature maps from cheap operations. The proposed Ghost module can be used as a plug-and-play component to upgrade existing convolutional neural networks. Stacking Ghost modules yields the lightweight GhostNet.

On ImageNet ILSVRC-2012, GhostNet achieves higher recognition performance than MobileNetV3 (e.g., 75.7% top-1 accuracy) at a similar computational cost.

GhostNet: More Features from Cheap Operations

Code: github.com/huawei-noah…

Welcome to follow the official account CV Technical Guide, which focuses on computer vision technique summaries, the latest technology tracking, and interpretation of classic papers.

The starting point

Over the years, a series of methods have been proposed to obtain compact deep neural networks, such as network pruning, low-bit quantization, and knowledge distillation. Network pruning removes unimportant connections or uses regularization to prune filters, yielding efficient CNNs. Low-bit quantization quantizes weights and activations to 1-bit data to achieve large compression and acceleration ratios. Knowledge distillation transfers knowledge from a larger model to a smaller one.

However, the performance of these methods is usually capped by the pre-trained neural networks used as their baselines.

The rich and even redundant information in the feature maps of a trained deep neural network usually ensures a comprehensive understanding of the input data. For example, the figure above shows some feature maps of an input image generated by ResNet-50; there are many similar feature-map pairs, like ghosts of each other. Redundancy in feature maps may be an important characteristic of successful deep neural networks. Rather than avoiding redundant feature maps, we prefer to embrace them, but in a cost-efficient way.

In a well-trained, normal-sized network, there are a large number of redundant feature maps. Model pruning (or model compression) and regularization are ways to reduce redundant feature maps, whereas this paper argues that this redundant information plays an important role in correct recognition or detection.

I recommend reading “Do We Really Need Model Compression?” to better understand the above paragraph.

The main contributions

A new Ghost module is introduced that generates more features with fewer parameters. Specifically, an ordinary convolutional layer in a deep neural network is split into two parts. The first part is an ordinary convolution, but its total number of filters is strictly controlled. Given the intrinsic feature maps produced by the first part, a series of simple linear operations is then applied to generate more feature maps. Compared with ordinary convolutional neural networks, the total number of parameters and the computational complexity required by the Ghost module are reduced without changing the size of the output feature map.

Based on the Ghost module, an efficient neural architecture, GhostNet, is established. The original convolutional layers in benchmark neural architectures are first replaced to demonstrate the effectiveness of the Ghost module, and then the superiority of GhostNet is verified on several benchmark vision datasets.

Experimental results show that the proposed Ghost module can reduce the computational cost of generic convolutional layers while maintaining similar recognition performance, and that GhostNet can surpass state-of-the-art efficient deep models such as MobileNetV3 on various tasks, with fast inference on mobile devices.

Methods

Ghost module

As shown in the figure above, the Ghost module first produces a reduced number of feature maps through an ordinary convolution, and then passes them through a depthwise convolution (the cheap operation) in parallel with an identity transformation.

1. The preceding convolution can be a 1×1 convolution or an ordinary 3×3 or 5×5 convolution.

2. Φ here denotes the cheap operation, which can be a depthwise convolution or another form of convolution, such as group convolution. Its function is to generate similar feature maps, that is, to preserve the redundant information in a cheaper way.

3. The identity mapping is carried out in parallel with the linear transformations in the Ghost module to preserve the intrinsic feature maps (a minimal code sketch of the whole module follows this list).
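Below is a minimal PyTorch sketch of this structure, written as an assumption-laden illustration rather than the paper's official implementation; the class and argument names (GhostModule, ratio, dw_size) are chosen here for clarity.

```python
import math

import torch
import torch.nn as nn


class GhostModule(nn.Module):
    """Sketch of a Ghost module: ordinary conv + cheap depthwise conv + identity."""

    def __init__(self, in_channels, out_channels, kernel_size=1,
                 ratio=2, dw_size=3, stride=1, relu=True):
        super().__init__()
        self.out_channels = out_channels
        init_channels = math.ceil(out_channels / ratio)  # m = n / s intrinsic maps
        new_channels = init_channels * (ratio - 1)       # m * (s - 1) ghost maps

        # Part 1: ordinary convolution with a strictly controlled channel count.
        self.primary_conv = nn.Sequential(
            nn.Conv2d(in_channels, init_channels, kernel_size, stride,
                      kernel_size // 2, bias=False),
            nn.BatchNorm2d(init_channels),
            nn.ReLU(inplace=True) if relu else nn.Identity(),
        )

        # Part 2: cheap operation (depthwise convolution) applied to the
        # intrinsic feature maps to generate the "ghost" feature maps.
        self.cheap_operation = nn.Sequential(
            nn.Conv2d(init_channels, new_channels, dw_size, 1,
                      dw_size // 2, groups=init_channels, bias=False),
            nn.BatchNorm2d(new_channels),
            nn.ReLU(inplace=True) if relu else nn.Identity(),
        )

    def forward(self, x):
        intrinsic = self.primary_conv(x)         # ordinary convolution
        ghost = self.cheap_operation(intrinsic)  # cheap linear operations
        # Identity branch: keep the intrinsic maps and concatenate the ghosts.
        out = torch.cat([intrinsic, ghost], dim=1)
        return out[:, :self.out_channels, :, :]
```

With ratio=2, roughly half of the output channels come from the ordinary convolution and the other half from the depthwise cheap operation, which is where the parameter and FLOP savings come from.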

Complexity analysis

Suppose the size of the input feature map is h × w × c, the size of the output feature map is h′ × w′ × n, and the size of the convolution kernel is k × k.

In the cheap-operation transformation, assume that the number of intrinsic feature maps is m, the number of transformations applied to each intrinsic feature map is s, and the number of feature maps finally obtained is n. Then:

n = m ∗ s

Since the final transformation in the Ghost module is an identity mapping, the number of effective (non-identity) transformations is s − 1, so the number of feature maps generated by the cheap operations is:

m ∗ (s − 1) = (n/s) ∗ (s − 1)

Therefore, the theoretical speed-up ratio of upgrading an ordinary convolution with the Ghost module is:

r_s = (n ∗ h′ ∗ w′ ∗ c ∗ k ∗ k) / ((n/s) ∗ h′ ∗ w′ ∗ c ∗ k ∗ k + (s − 1) ∗ (n/s) ∗ h′ ∗ w′ ∗ d ∗ d) ≈ (s ∗ c) / (s + c − 1) ≈ s

The theoretical compression ratio is:

r_c = (n ∗ c ∗ k ∗ k) / ((n/s) ∗ c ∗ k ∗ k + (s − 1) ∗ (n/s) ∗ d ∗ d) ≈ (s ∗ c) / (s + c − 1) ≈ s

where d ∗ d is the kernel size of each linear operation (of similar magnitude to k ∗ k) and s is much smaller than c.
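As a quick numerical check of these approximations, the short Python snippet below plugs in illustrative values (c = n = 256, k = d = 3, s = 2); these numbers are examples chosen here, not figures from the paper.

```python
# Illustrative FLOP comparison between an ordinary convolution and a Ghost
# module, using the formulas above (the h' * w' spatial factors cancel out).
def speed_up_ratio(c, n, k, d, s):
    ordinary_flops = n * c * k * k                    # per output position
    ghost_flops = (n // s) * c * k * k \
                  + (s - 1) * (n // s) * d * d        # primary conv + cheap ops
    return ordinary_flops / ghost_flops


if __name__ == "__main__":
    ratio = speed_up_ratio(c=256, n=256, k=3, d=3, s=2)
    print(f"theoretical speed-up ratio: {ratio:.2f}")  # about 1.99, i.e. roughly s
```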

GhostNet

GhostNet is built by stacking Ghost bottlenecks, each of which consists of two Ghost modules: the first acts as an expansion layer that increases the number of channels, and the second reduces the number of channels to match the shortcut path. The overall architecture largely follows MobileNetV3, with the squeeze-and-excitation (SE) module applied in some Ghost bottlenecks and a width multiplier used to trade accuracy for efficiency.
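As a rough illustration (not the paper's full architecture), a stride-1 Ghost bottleneck could be sketched as follows, reusing the hypothetical GhostModule class from the earlier sketch; stride-2 downsampling and the SE module are omitted, and the 1×1 shortcut projection is a simplification.

```python
import torch.nn as nn


class GhostBottleneck(nn.Module):
    """Simplified stride-1 Ghost bottleneck: two Ghost modules plus a shortcut."""

    def __init__(self, in_channels, hidden_channels, out_channels):
        super().__init__()
        # The first Ghost module expands the channel count,
        # the second reduces it to match the shortcut path.
        self.ghost1 = GhostModule(in_channels, hidden_channels, relu=True)
        self.ghost2 = GhostModule(hidden_channels, out_channels, relu=False)
        self.shortcut = (
            nn.Identity() if in_channels == out_channels
            else nn.Sequential(
                nn.Conv2d(in_channels, out_channels, 1, bias=False),
                nn.BatchNorm2d(out_channels),
            )
        )

    def forward(self, x):
        return self.ghost2(self.ghost1(x)) + self.shortcut(x)
```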

Conclusion

The Ghost module splits an ordinary convolution into a small ordinary convolution plus cheap linear operations, exploiting rather than removing feature-map redundancy. The resulting GhostNet achieves recognition performance comparable to or better than MobileNetV3 at a similar computational cost.

This article comes from the paper-sharing series of the official account CV Technical Guide.

