Over the past few years, deep learning has performed very well on problems such as visual recognition, speech recognition, natural language processing, and many others. Among the different types of deep neural networks, the convolutional neural network (CNN) is the most extensively studied. Early on, it was difficult to train high-performance convolutional neural networks without overfitting because training data and computing power were lacking. The growth of labeled data and the recent development of GPUs have led to a surge of convolutional neural network research and state-of-the-art results. This article reviews the recent development of convolutional neural networks and introduces some of their applications in visual recognition.

The rapid development of CNNs is due to the emergence of different architectures such as LeNet-5, AlexNet, ZFNet, VGGNet, GoogLeNet, and ResNet.

1. Basic structure of CNN

There are many convolutional neural network architectures, but their basic structures are similar. Take LeNet-5 as an example, as shown in the figure below. It contains three main types of layers: convolutional layers, pooling layers, and fully connected layers.

 

The workflow of the convolutional network in the figure is as follows. The input layer consists of 32×32 input nodes that receive the original image. The computation then alternates between convolution and subsampling, as described below:

The first hidden layer performs convolution. It consists of 6 feature maps, each made up of 28×28 neurons, and each neuron has a 5×5 receptive field.

The second hidden layer performs subsampling and local averaging. It also consists of 6 feature maps, but each feature map is made up of 14×14 neurons. Each neuron has a 2×2 receptive field, a trainable coefficient, a trainable bias, and a sigmoid activation function. The trainable coefficient and bias control the operating point of the neuron.

The third hidden layer performs the second convolution. It consists of 16 feature maps, each made up of 10×10 neurons. Each neuron in this hidden layer may have synaptic connections from several feature maps of the previous hidden layer, and it otherwise operates in a similar manner to the first convolutional layer.

The fourth hidden layer performs the second subsampling and local averaging. It consists of 16 feature maps, each made up of 5×5 neurons, and it operates in a similar manner to the first subsampling layer.

The fifth hidden layer implements the final stage of convolution and consists of 120 neurons, each assigned a 5×5 receptive field.

Finally, a fully connected layer produces the output vector.
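The layer sizes quoted above follow from simple arithmetic: a 5×5 convolution without padding shrinks each side by 4, and a 2×2 subsampling step halves it. The following is a minimal MATLAB sketch of that arithmetic, assuming stride-1 convolutions with no padding and non-overlapping 2×2 subsampling windows. The variable names are illustrative and do not come from the original network description.

```matlab
% Trace the side length of the feature maps through LeNet-5.
% Assumptions: unpadded 5x5 convolutions with stride 1 and
% non-overlapping 2x2 subsampling windows.
n = 32;                  % the input image is 32x32
kernel = 5;              % convolution kernel size
pool = 2;                % subsampling window size
sizes = zeros(1, 5);     % side length after each hidden layer

n = n - kernel + 1;  sizes(1) = n;   % first convolution
n = n / pool;        sizes(2) = n;   % first subsampling
n = n - kernel + 1;  sizes(3) = n;   % second convolution
n = n / pool;        sizes(4) = n;   % second subsampling
n = n - kernel + 1;  sizes(5) = n;   % final convolution

disp(sizes);             % side lengths 28, 14, 10, 5, 1
```

The final value of 1 is why the fifth hidden layer is described as 120 individual neurons rather than 120 feature maps.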

Successive computational layers alternate between convolution and subsampling, which produces a "bipyramidal" effect: at each convolution or subsampling layer, the number of feature maps increases compared with the corresponding previous layer while the spatial resolution decreases. The idea of convolution followed by subsampling was inspired by the notion of "simple" cells followed by "complex" cells in the animal visual system.

The convolutional layer learns feature representations of the input data. It is composed of many convolution kernels, each of which computes a different feature map.
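To make the idea of one kernel producing one feature map concrete, here is a small MATLAB sketch. The 6×6 input and the 3×3 kernel are made-up values chosen for illustration; in a real CNN the kernel weights are learned. Note that `conv2` flips the kernel, so the sketch rotates it by 180 degrees first to obtain the sliding dot product that CNN layers usually compute.

```matlab
% One convolution kernel slides over the input and yields one feature map.
img = magic(6);                        % toy 6x6 "image" (illustrative values)
kernel = [1 0 -1; 1 0 -1; 1 0 -1];     % hand-picked 3x3 kernel, not learned

% conv2 flips its second argument; rot90(kernel, 2) undoes that flip.
featureMap = conv2(img, rot90(kernel, 2), 'valid');   % 4x4 result

% Check one output element against the direct sliding-window definition.
patch = img(1:3, 1:3);
direct = sum(sum(patch .* kernel));
fprintf('featureMap(1,1) = %d, direct sum = %d\n', featureMap(1,1), direct);
```

A layer with several kernels would repeat this once per kernel and stack the resulting feature maps.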

Activation functions such as sigmoid, tanh, and ReLU introduce nonlinearity into the CNN.

The pooling layer reduces the size of the feature maps output by the convolutional layer and improves the results by making the structure less prone to overfitting. Typical choices are average pooling and max pooling.
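The two typical pooling operations can be sketched in a few lines of MATLAB. This is an illustration under stated assumptions: a single 4×4 feature map with made-up values, a 2×2 window, and a stride equal to the window size, so the windows do not overlap.

```matlab
% Max pooling and average pooling over non-overlapping 2x2 windows.
x = [1 3 2 4;
     5 7 6 8;
     9 2 1 0;
     3 4 5 6];                          % toy 4x4 feature map
p = 2;                                  % window size (stride equals p)
[h, w] = size(x);

% Rearrange the data so that every p-by-p window becomes one column.
blocks = reshape(x, p, h/p, p, w/p);    % row-in-window, window-row, col-in-window, window-col
blocks = permute(blocks, [1 3 2 4]);    % bring the two in-window dimensions together
blocks = reshape(blocks, p*p, h/p, w/p);

maxPooled = squeeze(max(blocks, [], 1));   % largest value in each window
avgPooled = squeeze(mean(blocks, 1));      % mean value in each window

disp(maxPooled);
disp(avgPooled);
```

Both outputs are 2×2, a quarter of the original number of elements, which is where the reduction in computation for later layers comes from.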

After the convolutional and pooling layers are stacked, one or more fully connected layers can be added to perform high-level reasoning.

 

2. Improvement strategies for CNNs

Since the success of AlexNet in 2012, researchers have proposed many ways to improve CNNs, which fall broadly under six aspects: convolutional layer, pooling layer, activation function, loss function, regularization, and optimization.

2.1 Convolutional layer

1). Network in network

Network in Network (NIN) replaces the linear filter of the convolutional layer with a micro network, a small multilayer perceptron.

 

2). Inception module: an extension of NIN. It uses filters of different sizes to capture visual patterns of different sizes.

2.2 Pooling layer

The pooling layer is an important part of a CNN. It lowers the computational cost by reducing the number of connections between convolutional layers.

 

1). Lp pooling: a biologically inspired method modeled on the operating mechanism of complex cells

 

2). Mixed pooling: a combination of max pooling and average pooling

 

3). Stochastic pooling: a method inspired by Dropout

 

4). Spectral pooling

 

5). Spatial pyramid pooling

 

Spatial pyramid pooling converts the convolutional features of an image at any scale into a vector of the same dimension. This lets a CNN process images at any scale and also avoids cropping and warping, which lose some information.

 

A typical CNN requires input images of a fixed size because the fully connected layers need a fixed input dimension, whereas the convolution operation itself places no restriction on image scale. The authors therefore proposed spatial pyramid pooling: the image is convolved first, and the resulting features are then converted to a fixed dimension before being fed to the fully connected layers. This extends CNNs to images of any size.
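The fixed-length property can be illustrated with a short MATLAB sketch. It is a simplified illustration, not the authors' implementation: it assumes a single-channel feature map, max pooling inside each bin, and pyramid levels of 1, 2, and 4 bins per side. The bin boundaries use a floor/ceil rule so that bins cover the whole map even when its size is not divisible by the number of bins.

```matlab
% Spatial pyramid pooling sketch: any input size -> same output length.
levels = [1 2 4];                        % bins per side at each level (assumed)
inputs = {rand(13, 17), rand(8, 8)};     % two feature maps of different sizes
lens = zeros(1, numel(inputs));          % output length for each input

for t = 1:numel(inputs)
    f = inputs{t};
    [h, w] = size(f);
    v = [];                              % pooled feature vector
    for n = levels
        for i = 1:n
            rows = floor((i-1)*h/n)+1 : ceil(i*h/n);
            for j = 1:n
                cols = floor((j-1)*w/n)+1 : ceil(j*w/n);
                v(end+1) = max(max(f(rows, cols)));   % max-pool one bin
            end
        end
    end
    lens(t) = numel(v);
    fprintf('%dx%d input -> %d features\n', h, w, lens(t));
end
```

Both inputs yield 1 + 4 + 16 = 21 values per channel, so a fully connected layer placed after this step always sees the same input dimension.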

2.3 Activation function

An appropriate activation function can noticeably improve the performance of a CNN. Commonly used nonlinear activation functions include sigmoid, tanh, and ReLU. The first two, sigmoid and tanh, are more common in fully connected layers, while ReLU is more common in convolutional layers.

The sigmoid curve was introduced in "Neural Networks and Deep Learning (I)"; the characteristic curves of the ReLU, LReLU, PReLU, RReLU, and ELU activation functions are shown here.
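The activation functions named above are easy to write down as MATLAB anonymous functions. The sketch below covers sigmoid, ReLU, leaky ReLU (LReLU), and ELU. The slope 0.01 and the ELU scale 1.0 are commonly used values assumed here for illustration. PReLU has the same form as LReLU but learns the negative-side slope during training, and RReLU samples that slope at random during training, so neither is shown separately.

```matlab
% Common activation functions evaluated on a few sample inputs.
x = [-2 -1 0 1 2];
slope = 0.01;        % negative-side slope for leaky ReLU (assumed value)
eluScale = 1.0;      % ELU scale parameter (assumed value)

sigmoid = @(z) 1 ./ (1 + exp(-z));
relu    = @(z) max(z, 0);
lrelu   = @(z) max(z, 0) + slope * min(z, 0);
elu     = @(z) max(z, 0) + eluScale * (exp(min(z, 0)) - 1);

disp([x; sigmoid(x); relu(x); lrelu(x); elu(x)]);
% plot(x, relu(x)) and similar calls would draw the characteristic curves.
```

All of the ReLU variants agree with ReLU for positive inputs; they differ only in how they treat negative inputs.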

 

2.4 Loss function

1). Softmax loss

2). Hinge loss: used to train large-margin classifiers such as SVMs

3). Contrastive loss: used to train Siamese networks
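The three losses listed above can be sketched for a single training example in MATLAB. The scores, label, distance, and margins are made-up values. The hinge loss shown is the common multi-class form with margin 1, and the contrastive loss follows one common formulation (half the squared distance for similar pairs, half the squared margin violation for dissimilar pairs); other conventions exist.

```matlab
% Softmax, hinge, and contrastive losses for one example (illustrative values).
scores = [2.0 1.5 0.1];      % raw class scores
label = 1;                   % index of the correct class

% Softmax loss: negative log of the softmax probability of the true class.
shifted = scores - max(scores);               % subtract the max for stability
probs = exp(shifted) / sum(exp(shifted));
softmaxLoss = -log(probs(label));

% Multi-class hinge loss with margin 1.
margins = max(0, scores - scores(label) + 1);
margins(label) = 0;                           % the true class is not penalized
hingeLoss = sum(margins);

% Contrastive loss for a pair of embeddings at distance d, margin m.
d = 0.8;  m = 1.0;
lossSimilar    = 0.5 * d^2;                   % pair labelled as the same
lossDissimilar = 0.5 * max(0, m - d)^2;       % pair labelled as different

fprintf('softmax %.4f, hinge %.4f\n', softmaxLoss, hingeLoss);
```

Here the second class scores within the margin of the true class, so the hinge loss is nonzero even though the prediction is correct.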

2.5 Regularization

Regularization can effectively reduce overfitting in CNNs. Here are two regularization techniques, Dropout and DropConnect.

Dropout randomly disables some hidden-layer nodes during model training. The disabled nodes can be treated as temporarily not part of the network structure, but their weights are retained (they are just not updated for that step). DropConnect is a further development of Dropout. Instead of randomly setting the outputs of neurons to 0, it randomly sets elements of the weight matrix W to 0.
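The difference between the two techniques is where the random mask is applied, which a short MATLAB sketch makes visible. This is an illustration for one fully connected layer with made-up sizes and a keep probability of 0.5. It omits the rescaling that practical implementations apply to compensate for the dropped units.

```matlab
% Dropout masks the layer outputs; DropConnect masks individual weights.
rng(0);                       % fixed seed so the sketch is repeatable
keepProb = 0.5;               % probability of keeping a unit or weight (assumed)
x = rand(4, 1);               % input to the layer
W = rand(3, 4);               % weight matrix of the layer

% Dropout: zero a random subset of the output units.
y = W * x;
unitMask = rand(size(y)) < keepProb;
yDropout = y .* unitMask;

% DropConnect: zero a random subset of the weights, then compute the output.
weightMask = rand(size(W)) < keepProb;
yDropConnect = (W .* weightMask) * x;

disp([y yDropout yDropConnect]);
```

In both cases W itself is left untouched, so the full set of weights is available again for the next training step and at test time.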

2.6 Optimization

Here are several key techniques for optimizing CNNs:

1). Weight initialization

2). Stochastic gradient descent

3). Batch Normalization

4). Shortcut connections
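As an illustration of one item from the list above, the forward pass of batch normalization can be sketched in MATLAB. The mini-batch values are made up, the scale and shift parameters are set to their usual initial values of 1 and 0, and the code relies on implicit expansion, which assumes MATLAB R2016b or later. It shows only the training-time normalization, not the running statistics used at test time.

```matlab
% Batch normalization forward pass for one mini-batch.
X = [1 2; 3 4; 5 6];          % 3 samples (rows) x 2 features (columns)
scale = [1 1];                % learnable scale, initial value
shift = [0 0];                % learnable shift, initial value
epsilon = 1e-5;               % small constant to avoid division by zero

mu = mean(X, 1);                              % per-feature mean over the batch
sigma2 = var(X, 1, 1);                        % per-feature variance (divide by N)
Xhat = (X - mu) ./ sqrt(sigma2 + epsilon);    % normalize each feature
Y = scale .* Xhat + shift;                    % apply learnable scale and shift

disp(Y);
```

After this step each feature has approximately zero mean and unit variance across the mini-batch.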

3. Accelerating CNN computation

3.1 FFT: performing convolution with the fast Fourier transform allows some quantities to be reused, such as the Fourier transform of the output gradient

3.2 Matrix factorization: matrix factorization reduces the amount of computation and so accelerates CNN training

3.3 Vector quantization: vector quantization (VQ) is used to compress the densely connected layers, making the CNN model smaller
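The FFT approach rests on the fact that convolution in the spatial domain is element-wise multiplication in the frequency domain. The MATLAB sketch below checks this on a toy example with made-up values; it does not attempt to show the reuse of transforms across a training step.

```matlab
% Convolution via the FFT compared with direct convolution.
img = magic(8);                            % toy 8x8 input
kernel = [1 2 1; 0 0 0; -1 -2 -1];         % toy 3x3 kernel
outSize = size(img) + size(kernel) - 1;    % size of the full linear convolution

% Zero-pad both arrays to outSize, multiply the spectra, transform back.
F = fft2(img, outSize(1), outSize(2));
K = fft2(kernel, outSize(1), outSize(2));
viaFFT = real(ifft2(F .* K));

direct = conv2(img, kernel, 'full');
maxErr = max(abs(viaFFT(:) - direct(:)));
fprintf('largest difference: %g\n', maxErr);
```

The two results agree up to floating-point rounding. The transform `F` depends only on the input, so it can be computed once and reused with many kernels, which is the kind of reuse the FFT method exploits.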

4. Main applications of CNN

Using CNNs, the following applications can achieve state-of-the-art performance:

1). Image Classification

2). Object Tracking

3). Pose Estimation

4). Text Detection

5). Visual Saliency Detection

6). Action Recognition

7). Scene Labeling

function varargout = System_Main(varargin)

gui_Singleton = 1;
gui_State = struct('gui_Name',       mfilename, ...
                   'gui_Singleton',  gui_Singleton, ...
                   'gui_OpeningFcn', @System_Main_OpeningFcn, ...
                   'gui_OutputFcn',  @System_Main_OutputFcn, ...
                   'gui_LayoutFcn',  [] , ...
                   'gui_Callback',   []);
if nargin && ischar(varargin{1})
    gui_State.gui_Callback = str2func(varargin{1});
end

if nargout
    [varargout{1:nargout}] = gui_mainfcn(gui_State, varargin{:});
else
    gui_mainfcn(gui_State, varargin{:});
    
end


function System_Main_OpeningFcn(hObject, eventdata, handles, varargin)
handles.output = hObject;
guidata(hObject, handles);
movegui(hObject,'center');




function varargout = System_Main_OutputFcn(hObject, eventdata, handles) 

varargout{1} = handles.output;



% Open-image menu
function OpenPicture_Callback(hObject, eventdata, handles)
% uigetfile returns the file name first and the folder second.
[filename, pathname] = uigetfile('*.jpg', 'Select an image');
if isequal(filename, 0)
    return; % the user cancelled the dialog
end
fpath = fullfile(pathname, filename);
handles.chepailujing = fpath;   % full path of the license plate image
axes(handles.axes1);
im = imread(fpath);
%im=imresize(im,[240,320])
imshow(im);
title('Original image', 'FontWeight', 'Bold');
handles.IM = im;
guidata(hObject, handles);
% Save-image menu
function SavePicture_Callback(hObject, eventdata, handles)





% --- License plate localization button
function CarDiwei_Callback(hObject, eventdata, handles)
[PY2,PY1,PX2,PX1]=caitu_fenge(handles.IM); % segmentation method
axes(handles.axes1); hold on;
row = [PY1 PY2];
col = [PX1 PX2];
plot([col(1) col(2)], [row(1) row(1)], 'g-', 'LineWidth', 3);
plot([col(1) col(2)], [row(2) row(2)], 'g-', 'LineWidth', 3);
plot([col(1) col(1)], [row(1) row(2)], 'g-', 'LineWidth', 3);
plot([col(2) col(2)], [row(1) row(2)], 'g-', 'LineWidth', 3);
hold off;
I_bai=handles.IM;
[PY2,PY1,PX2,PX1,threshold]=SEC_xiuzheng(PY2,PY1,PX2,PX1); % refine the plate position
handles.threshold=threshold;
Plate=I_bai(PY1:PY2,PX1:PX2,:);
bw=Plate;
handles.bw=bw;
guidata(hObject,handles);
axes(handles.axes2);
imshow(bw);
title('License plate image');



% --- Executes on button press in CarXuanzhuan.
function CarXuanzhuan_Callback(hObject, eventdata, handles)
bw=rgb2gray(handles.bw);

qingxiejiao=rando_bianhuan(bw);
handles.qingxiejiao=qingxiejiao;
bw=imrotate(bw,qingxiejiao,'bilinear','crop');
axes(handles.axes3);
imshow(bw);
title('Angle-corrected license plate image');
bw=im2bw(bw,graythresh(bw));%figure,imshow(bw);
bw=bwmorph(bw,'hbreak',inf);%figure,imshow(bw);
bw=bwmorph(bw,'spur',inf);%figure,imshow(bw);title('Before erasing');
bw=bwmorph(bw,'open',5);%figure,imshow(bw);title('Morphological opening');
handles.bw=bw;
guidata(hObject,handles);


% --- Executes on button press in EditBlue.
function EditBlue_Callback(hObject, eventdata, handles)
bw = bwareaopen(handles.bw, handles.threshold);
bw=~bw;
bw=touying(bw); % projection processing along the Y direction
bw=~bw; % invert back after erasing
bw = bwareaopen(bw, handles.threshold);
bw=~bw; % second erase pass
handles.bw=bw;
guidata(hObject,handles);
axes(handles.axes4);
imshow(bw);
title('Excess blue removed');


% --- Executes on button press in EditCar.
function EditCar_Callback(hObject, eventdata, handles)
bw=handles.bw;
[y,x]=size(bw); % get the image height and width
%=================Character segmentation=================================
fenge=shuzifenge(bw,handles.qingxiejiao);
[m,k]=size(fenge);
%=================Display the segmentation results=========================
figure;
for s=1:2:k-1
    subplot(1,k/2,(s+1)/2);imshow(bw( 1:y,fenge(s):fenge(s+1)));
end



function CarShibie_Callback(hObject, eventdata, handles)
bw=handles.bw;
[y,x]=size(bw);
fenge=shuzifenge(bw,handles.qingxiejiao);
[m,k]=size(fenge);
%================ Locate the seven character images =============== e.g. 桂AV6388
han_zi  =bw( 1:y,fenge(1):fenge(2));
zi_mu   =bw( 1:y,fenge(3):fenge(4));
zm_sz_1 =bw( 1:y,fenge(5):fenge(6));
zm_sz_2 =bw( 1:y,fenge(7):fenge(8));  
shuzi_1 =bw( 1:y,fenge(9):fenge(10)); 
shuzi_2 =bw( 1:y,fenge(11):fenge(12)); 
shuzi_3 =bw( 1:y,fenge(13):fenge(14)); 
%==========================Recognition====================================
%======================Read in the size-corrected data==============================
xiuzhenghanzi =   imresize(han_zi, [110 55],'bilinear');
xiuzhengzimu  =   imresize(zi_mu,  [110 55],'bilinear');
xiuzhengzm_sz_1=  imresize(zm_sz_1,[110 55],'bilinear');
xiuzhengzm_sz_2 = imresize(zm_sz_2,[110 55],'bilinear');
xiuzhengshuzi_1 = imresize(shuzi_1,[110 55],'bilinear');
xiuzhengshuzi_2 = imresize(shuzi_2,[110 55],'bilinear');
xiuzhengshuzi_3 = imresize(shuzi_3,[110 55],'bilinear');
%============ Store the templates for 0-9, A-Z, and the province abbreviations for easy access ====================
hanzishengfen=duquhanzi(imread('picture/cpgui.bmp'),imread('picture/cpguizhou.bmp'),imread('picture/cpjing.bmp'),imread('picture/cpsu.bmp'),imread('picture/cpyue.bmp'));
% Digits and letters have different proportions, so this part needs to be changed
shuzizimu=duquszzm(imread('picture/0.bmp'),imread('picture/1.bmp'),imread('picture/2.bmp'),imread('picture/3.bmp'),imread('picture/4.bmp'),...
                   imread('picture/5.bmp'),imread('picture/6.bmp'),imread('picture/7.bmp'),imread('picture/8.bmp'),imread('picture/9.bmp'),...
                   imread('picture/10.bmp'),imread('picture/11.bmp'),imread('picture/12.bmp'),imread('picture/13.bmp'),imread('picture/14.bmp'),...
                   imread('picture/15.bmp'),imread('picture/16.bmp'),imread('picture/17.bmp'),imread('picture/18.bmp'),imread('picture/19.bmp'),...
                   imread('picture/20.bmp'),imread('picture/21.bmp'),imread('picture/22.bmp'),imread('picture/23.bmp'),imread('picture/24.bmp'),...
                   imread('picture/25.bmp'),imread('picture/26.bmp'),imread('picture/27.bmp'),imread('picture/28.bmp'),imread('picture/29.bmp'),...
                   imread('picture/30.bmp'),imread('picture/31.bmp'),imread('picture/32.bmp'),imread('picture/33.bmp'));
zimu  = duquzimu(imread('picture/10.bmp'),imread('picture/11.bmp'),imread('picture/12.bmp'),imread('picture/13.bmp'),imread('picture/14.bmp'),...
                 imread('picture/15.bmp'),imread('picture/16.bmp'),imread('picture/17.bmp'),imread('picture/18.bmp'),imread('picture/19.bmp'),...
                 imread('picture/20.bmp'),imread('picture/21.bmp'),imread('picture/22.bmp'),imread('picture/23.bmp'),imread('picture/24.bmp'),...
                 imread('picture/25.bmp'),imread('picture/26.bmp'),imread('picture/27.bmp'),imread('picture/28.bmp'),imread('picture/29.bmp'),...
                 imread('picture/30.bmp'),imread('picture/31.bmp'),imread('picture/32.bmp'),imread('picture/33.bmp'));
shuzi = duqushuzi(imread('picture/0.bmp'),imread('picture/1.bmp'),imread('picture/2.bmp'),imread('picture/3.bmp'),imread('picture/4.bmp'),...
                 imread('picture/5.bmp'),imread('picture/6.bmp'),imread('picture/7.bmp'),imread('picture/8.bmp'),imread('picture/9.bmp')); 
%============================Recognition results================================
i=1; % note: the function shibiezm_sz has problems recognizing digits
jieguohanzi  = shibiehanzi(hanzishengfen,xiuzhenghanzi);shibiejieguo(1,i) =jieguohanzi;  i=i+1;
jieguozimu   = shibiezimu(zimu,xiuzhengzimu);           shibiejieguo(1,i) =jieguozimu;   i=i+1;
jieguozm_sz_1= shibiezm_sz(shuzizimu,xiuzhengzm_sz_1);  shibiejieguo(1,i) =jieguozm_sz_1;i=i+1;
jieguozm_sz_2= shibiezm_sz(shuzizimu,xiuzhengzm_sz_2);  shibiejieguo(1,i) =jieguozm_sz_2;i=i+1;
jieguoshuzi_1= shibieshuzi(shuzi,xiuzhengshuzi_1);      shibiejieguo(1,i) =jieguoshuzi_1;i=i+1;
jieguoshuzi_2= shibieshuzi(shuzi,xiuzhengshuzi_2);      shibiejieguo(1,i) =jieguoshuzi_2;i=i+1;
jieguoshuzi_3= shibieshuzi(shuzi,xiuzhengshuzi_3);      shibiejieguo(1,i) =jieguoshuzi_3;i=i+1;
%==========================Show the result in the dialog=============================================
set(handles.Result,'String', shibiejieguo);
handles.shibiejieguo=shibiejieguo;
guidata(hObject,handles);

% --- Executes on button press in CarVoide.
function CarVoide_Callback(hObject, eventdata, handles)
duchushengyin(handles.shibiejieguo);


% --- Executes on button press in CNNbut.
function CNNbut_Callback(hObject, eventdata, handles)