Deep learning 010-Keras Fine Tuning to Improve Performance (Multi-classification problem)

(Python libraries and versions used in this article: Python 3.6, Numpy 1.14, Scikit-Learn 0.19, Matplotlib 2.2, Keras 2.1.6, Tensorflow 1.9.0)

In the previous article ([Furnace-ai] Deep learning 007-Keras fine-tuning to further improve performance), we fine-tuned a binary classification model with Keras, which further improved its accuracy. Here we look at how fine-tuning can be used to improve performance on a multi-classification problem.


1. Prepare data sets

Just as in Deep learning 008-Keras solving multi-classification problems, flow_from_directory requires class_mode to be 'categorical', as sketched below.
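For reference, here is a minimal sketch of the data generators; the folder paths, image size, and batch size are assumptions for illustration, not values taken from the original project.

# A minimal sketch of the data preparation, assuming train/ and val/ each
# contain one sub-folder per class (the paths and sizes below are assumptions).
from keras.preprocessing.image import ImageDataGenerator

IMG_W, IMG_H, IMG_CH = 150, 150, 3   # assumed image size
batch_size = 16                      # assumed batch size

train_datagen = ImageDataGenerator(rescale=1. / 255)
val_datagen = ImageDataGenerator(rescale=1. / 255)

train_generator = train_datagen.flow_from_directory(
    'dataset/train',                 # hypothetical path
    target_size=(IMG_W, IMG_H),
    batch_size=batch_size,
    class_mode='categorical')        # multi-class: one-hot labels

val_generator = val_datagen.flow_from_directory(
    'dataset/val',                   # hypothetical path
    target_size=(IMG_W, IMG_H),
    batch_size=batch_size,
    class_mode='categorical')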


2. Fine-tune the second half of VGG16

The model built here uses the VGG16 "body" (include_top=False) as the feature extractor, plus our own "head" (the fully connected classifier whose weights were trained in Keras transfer learning to improve performance (multi-classification problem)). That article already reached an accuracy of 0.96. However, in that transfer-learning setup we did not modify the weights of the VGG16 network at all; we used it directly to extract features. The fine-tuning here adjusts the weights of the higher convolutional layers of VGG16 to make them better suited to our own project.

The following is the model-building function:

# 4. Build the model
import os
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Dropout, Flatten, Dense
from keras import applications
from keras import optimizers
from keras.models import Model

def build_model():
    base_model = applications.VGG16(weights='imagenet', include_top=False, input_shape=(IMG_W, IMG_H, IMG_CH))
    # With include_top=False, input_shape must be set, otherwise an error is raised.
    # This step uses the VGG16 function of the applications module to directly load the model
    # and its pre-trained parameters as the "body" of our own model.
    
    # Define our own classifier as the "head" of our own model.
    top_model = Sequential()
    top_model.add(Flatten(input_shape=base_model.output_shape[1:])) 
    top_model.add(Dense(256, activation='relu'))
    top_model.add(Dropout(0.5))
    top_model.add(Dense(class_num, activation='softmax')) # Multiple classification problem
    
    top_model.load_weights(os.path.join(save_folder, 'top_FC_model'))
    # The model structure is defined above; here the previously trained parameters are loaded into it.
    
    my_model = Model(inputs=base_model.input, outputs=top_model(base_model.output)) # Assemble "body" and "head" together
    # my_model is the complete model we have assembled, with its weights loaded
    
    # A normal model trains and adjusts the weights of all layers, but here we only fine-tune
    # the last few convolution layers of VGG16, so the earlier layers must be frozen.
    for layer in my_model.layers[:15]:  # the first 15 layers are frozen and not trained
        layer.trainable = False
        
    # Model configuration
    my_model.compile(loss='categorical_crossentropy',
                  optimizer=optimizers.SGD(lr=1e-4, momentum=0.9), # Use a very small LR for fine tuning
                  metrics=['accuracy'])
    return my_model
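The training call itself is not shown in the excerpt above. A minimal sketch of how the fine-tuning might be launched is given below; the generator names, step counts, and save path are assumptions, not taken verbatim from the original project.

# A sketch of launching the fine-tuning; generator names, steps, and the save
# path are assumptions.
my_model = build_model()
print('Start to fine-tune my_model')
history = my_model.fit_generator(
    train_generator,
    steps_per_epoch=8,               # 8 batches per epoch, matching the log below
    epochs=50,
    validation_data=val_generator,
    validation_steps=val_generator.samples // batch_size)
my_model.save_weights(os.path.join(save_folder, 'fine_tuned_model.h5'))  # hypothetical file name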

--------------------------------------------------------------------------------

Start to fine-tune my_model
Epoch 1/50
8/8 [==============================] - 124s 16s/step - loss: 0.0170 - acc: 0.9950 - val_loss: 0.2767 - val_acc: 0.9700
Epoch 2/50
8/8 [==============================] - 131s 16s/step - loss: 3.2684e-04 - acc: 1.0000 - val_loss: 0.2694 - val_acc: 0.9700
Epoch 3/50
8/8 [==============================] - 131s 16s/step - loss: 0.0175 - acc: 0.9950 - val_loss: 0.2593 - val_acc: 0.9700

...

Epoch 48/50
8/8 [==============================] - 132s 16s/step - loss: 0.0025 - acc: 1.0000 - val_loss: 0.2758 - val_acc: 0.9700
Epoch 49/50
8/8 [==============================] - 130s 16s/step - loss: 0.0080 - acc: 0.9950 - val_loss: 0.2922 - val_acc: 0.9700
Epoch 50/50
8/8 [==============================] - 131s 16s/step - loss: 4.7076e-04 - acc: 1.0000 - val_loss: 0.2875 - val_acc: 0.9700

--------------------------------------------------------------------------------

Acc and Loss:

It can be seen that the test accuracy stays at around 0.97 throughout, and acc and loss change very little between the start and the end of training. This indicates that fine-tuning brings no obvious performance improvement for this project: the data set is too small, which makes over-fitting likely. These problems could be addressed by enlarging the data set, for example through augmentation as sketched below.
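Since the limited gain is attributed to the small data set, one common mitigation is to augment the training data. This is only a hedged sketch of that idea, not part of the original code, and the parameter values are assumptions.

# A sketch of data augmentation with ImageDataGenerator, one way to reduce
# over-fitting on a small data set; all parameter values here are assumptions.
train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    rotation_range=20,
    width_shift_range=0.1,
    height_shift_range=0.1,
    shear_range=0.1,
    zoom_range=0.1,
    horizontal_flip=True)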

######################## Summary ########################

1. When fine-tuning is used for a multi-classification problem, the loss should be changed to categorical_crossentropy, and the SGD optimizer should be used with a very small learning rate, so that a large lr does not drastically change the already-trained weights of the earlier convolutional layers.

##########################################################


Note: The code for this article has been uploaded to (my GitHub); you are welcome to download it.