Deep Learning 005 - Solving binary classification in a few lines of Keras code

(Python libraries and versions used in this article: Python 3.6, Numpy 1.14, Scikit-Learn 0.19, Matplotlib 2.2, Keras 2.1.6, Tensorflow 1.9.0)

Many articles and textbooks use the MNIST dataset as the “Hello World” of deep learning, but MNIST has one notable characteristic: it is a typical multi-class problem (10 categories in total). When first getting into deep learning, I think we should start with the simplest case, binary classification.

As for deep learning frameworks, the currently popular ones include Tensorflow, Keras, PyTorch and Theano. I recommend that beginners start with Keras and move on to Tensorflow as they progress. Keras supports both Tensorflow and Theano as back ends; in effect it is a further layer of encapsulation on top of them, which makes it simpler, more practical and easier to get started with. A few lines of simple code are usually enough for a small project.

My main reference for this post is “Keras series - Image Multi-classification Training and Fine-tuning with Bottleneck Features (III)”, which in turn draws on “Building powerful image classification models using very little data”. I found that the code in these two posts no longer runs in many places, probably because Keras or Tensorflow has been upgraded since, so I made the necessary changes.

1. Prepare data sets

The classic binary classification data set is “Cat vs. Dog” from the Kaggle contest (train set: 25K images; test set: 12.5K images). Following the practice of the original post, I randomly pick 1000 dog photos + 1000 cat photos from the Kaggle train set as our new train set, and 400 dog photos + 400 cat photos as our new test set. Both train and test therefore contain two subfolders (cats and dogs). The selection is random and done in code. The code for preparing the small data set is as follows:

import os
import random
import shutil
from glob import glob

def dataset_prepare(raw_set_folder, dst_folder, train_num_per_class=1000, test_num_per_class=400):
    '''
    Prepare the small data set: copy train_num_per_class images of each class from
    raw_set_folder into the train folder, and test_num_per_class images of each class
    into the test folder.
    :param raw_set_folder: folder with the original images, whose file names start with 'cat' or 'dog' (e.g. dog.102.jpg)
    :param dst_folder: the selected images are placed in this folder
    :param train_num_per_class: number of train images per class
    :param test_num_per_class: number of test images per class
    :return: None
    '''
    all_imgs = glob(os.path.join(raw_set_folder, '*.jpg'))
    img_len = len(all_imgs)
    assert img_len > 0, '{} has no jpg image file'.format(raw_set_folder)

    cat_imgs = []
    dog_imgs = []
    for img_path in all_imgs:
        img_name = os.path.split(img_path)[1]
        if img_name.startswith('cat'):
            cat_imgs.append(img_path)
        elif img_name.startswith('dog'):
            dog_imgs.append(img_path)
    random.shuffle(cat_imgs)
    random.shuffle(dog_imgs)
    # Create train/test folders, each with dogs and cats subfolders
    # (os.makedirs stands in for the ensure_folder_exists helper, which is not shown in this post)
    for type_folder in ['train', 'test']:
        for class_folder in ['dogs', 'cats']:
            os.makedirs(os.path.join(dst_folder, type_folder, class_folder), exist_ok=True)
    # The following code can be further optimized...
    for cat_img_path in cat_imgs[:train_num_per_class]:  # the first N images are used as train
        _, fname = os.path.split(cat_img_path)  # split the path into folder and file name
        shutil.copyfile(cat_img_path, os.path.join(dst_folder, 'train', 'cats', fname))
    print('imgs saved to train/cats folder')
    for dog_img_path in dog_imgs[:train_num_per_class]:
        _, fname = os.path.split(dog_img_path)  # split the path into folder and file name
        shutil.copyfile(dog_img_path, os.path.join(dst_folder, 'train', 'dogs', fname))
    print('imgs saved to train/dogs folder')
    for cat_img_path in cat_imgs[-test_num_per_class:]:  # the last M images are used as test
        _, fname = os.path.split(cat_img_path)  # split the path into folder and file name
        shutil.copyfile(cat_img_path, os.path.join(dst_folder, 'test', 'cats', fname))
    print('imgs saved to test/cats folder')
    for dog_img_path in dog_imgs[-test_num_per_class:]:  # the last M images are used as test
        _, fname = os.path.split(dog_img_path)  # split the path into folder and file name
        shutil.copyfile(dog_img_path, os.path.join(dst_folder, 'test', 'dogs', fname))
    print('imgs saved to test/dogs folder')
    print('finished... ')
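One detail of this split is worth checking: the train images are the first N shuffled paths of a class and the test images are the last M, so the two sets are disjoint only when a class has at least N + M images. The sketch below reproduces that slicing on plain lists. The helper `split_class` and the generated file names are illustrative only and are not part of the original code.

```python
def split_class(paths, train_num, test_num):
    """Mimic the slicing in dataset_prepare: first N for train, last M for test."""
    train = paths[:train_num]
    test = paths[-test_num:]
    overlap = set(train) & set(test)  # files that would land in both sets
    return train, test, overlap

# Hypothetical file names; the Kaggle train set has 12,500 images per class.
cats = ['cat.{}.jpg'.format(i) for i in range(12500)]
train, test, overlap = split_class(cats, 1000, 400)
print(len(train), len(test), len(overlap))  # 1000 400 0

# With fewer than N + M images, the two slices share files.
small = cats[:1200]
_, _, overlap_small = split_class(small, 1000, 400)
print(len(overlap_small))  # 200
```

With the full Kaggle train set the defaults are safe; the overlap only becomes a concern if you run the function on a much smaller folder.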

Running this function builds the small data set. Next, let’s create an image data stream for Keras in preparation for building the model.

# 2. Prepare the training set. Keras provides generators that handle image loading, augmentation and other operations directly, which is very convenient
from keras.preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator(  # per-image processing; the train set usually gets image augmentation
        rescale=1. / 255,  # pixel values are 0-255; multiplying by 1/255 scales them to 0-1
        shear_range=0.2,  # shear transformation
        zoom_range=0.2,  # zoom in and out
        horizontal_flip=True)  # horizontal flip

train_generator = train_datagen.flow_from_directory(  # generate a data stream from a folder
    train_data_dir,  # training set image folder
    target_size=(IMG_W, IMG_H),  # resize every image to this size
    batch_size=batch_size,
    class_mode='binary')  # binary mode: labels are 0 or 1

# 3. Prepare the test set in the same way
val_datagen = ImageDataGenerator(rescale=1. / 255)  # only the same rescaling as the train set, no augmentation
val_generator = val_datagen.flow_from_directory(
        val_data_dir,
        target_size=(IMG_W, IMG_H),
        batch_size=batch_size,
        class_mode='binary')

The generators built above are the data streams that Keras needs. flow_from_directory reads images from the image folder (such as train_data_dir) batch by batch, and train_datagen preprocesses and augments them. The result is a stream of batches of batch_size images that is generated in an endless loop; training simply stops drawing from it once the specified number of epochs is reached.
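To make the idea of an endless batch stream concrete, here is a minimal pure-Python sketch. It is not how Keras implements its iterators (there is no shuffling, augmentation or image decoding here); `batch_stream` and the toy pixel values are my own illustration.

```python
import itertools

def batch_stream(samples, batch_size, rescale=1.0):
    """Yield rescaled batches forever, starting over after each full pass."""
    while True:
        for start in range(0, len(samples), batch_size):
            batch = samples[start:start + batch_size]
            yield [value * rescale for value in batch]

# Five toy "pixels" stand in for images; a real generator yields image arrays.
stream = batch_stream([0, 51, 102, 204, 255], batch_size=2, rescale=1. / 255)
first_four = list(itertools.islice(stream, 4))  # take only four batches
print([len(batch) for batch in first_four])  # [2, 2, 1, 2]
```

The fourth batch is the first one again, which is why the training call has to say how many batches make up one epoch.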

Above, ImageDataGenerator is used to augment the images. Its parameters are described as follows (please refer to the official Keras documentation):

rotation_range is a value in degrees (0 to 180), the range within which images are randomly rotated.

width_shift_range and height_shift_range set the extent of random horizontal and vertical shifts, as a fraction (between 0 and 1) of the total width or height.

rescale is a value by which the whole image is multiplied before any other processing. Our images consist of integers from 0 to 255 in each RGB channel; such values are too large for a model to handle well with a typical learning rate, so we scale them to between 0 and 1 with a factor of 1/255.

shear_range is the intensity of the random shear transformation.

zoom_range is the range for random zooming.

horizontal_flip randomly flips images horizontally. Use it when flipping horizontally does not change the image’s semantics.

fill_mode specifies how to fill the newly created pixels that appear after operations such as rotation or horizontal and vertical shifts.
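Two of these operations, rescale and horizontal_flip, are simple enough to sketch on a toy image stored as nested lists. This only illustrates the arithmetic; it is not the Keras implementation, and the helper names are my own.

```python
import random

def rescale(image, factor):
    """Multiply every pixel by factor (the effect of rescale=1. / 255)."""
    return [[pixel * factor for pixel in row] for row in image]

def random_horizontal_flip(image, rng):
    """Mirror each row left to right with probability 0.5."""
    if rng.random() < 0.5:
        return [row[::-1] for row in image]
    return image

# A 2 x 3 single-channel toy image with values in 0-255.
toy = [[0, 128, 255],
       [255, 128, 0]]
scaled = rescale(toy, 1. / 255)             # every value is now within 0-1
flipped = [row[::-1] for row in scaled]     # what a flip would produce
rng = random.Random(0)                      # seeded only for repeatability
augmented = random_horizontal_flip(scaled, rng)
```

Because the flip is random, `augmented` is either `scaled` or `flipped`; the network sees both variants over many epochs, which is the point of augmentation.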

2. Build and train Keras model

Keras encapsulates many Tensorflow functions, so it is easier to use. The trade-off is that fine-grained changes to the model structure and parameters are harder, so advanced users who want to adjust the structure and write custom functions can use Tensorflow directly.

2.1 Construction of Keras model

Whether in Keras or in Tensorflow, I think of model construction as two parts: building the model and configuring it. We can build a small model from these two aspects. The code is as follows:

# 4. Keras model construction: it mainly consists of model building and model configuration
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten, Dense
from keras import optimizers
def build_model(input_shape):
    # Model building: three convolution layers + two fully connected layers
    model = Sequential()
    model.add(Conv2D(32, (3, 3), input_shape=input_shape))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))

    model.add(Conv2D(32, (3, 3)))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))

    model.add(Conv2D(64, (3, 3)))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))

    model.add(Flatten())
    model.add(Dense(64))
    model.add(Activation('relu'))
    model.add(Dropout(0.5))  # Dropout helps prevent overfitting
    model.add(Dense(1))  # Dense(1), not Dense(2): with a sigmoid activation the model outputs a single value, the probability of class 1
    model.add(Activation('sigmoid'))  # binary problems use the sigmoid activation function

    # Model configuration
    model.compile(loss='binary_crossentropy',  # loss function
                  optimizer=optimizers.RMSprop(lr=0.0001),  # optimizer
                  metrics=['accuracy'])  # metric reported during training
    # Binary problems use binary_crossentropy as the loss; accuracy is only monitored, the optimizer minimizes the loss
    return model  # return the built model

This function builds the structure of the model and configures it, including the loss function, the optimizer and the metric to report. Of course, more configuration options are available.
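To see what the sigmoid output and the binary_crossentropy loss compute, here is a small sketch using only the standard library. It shows the textbook formulas for a single sample; Keras additionally clips the probabilities for numerical stability, which is left out here.

```python
import math

def sigmoid(z):
    """Squash any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_crossentropy(y_true, p):
    """Loss for one sample; y_true is 0 or 1, p is the predicted P(class 1)."""
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

# A model that always answers 0.5 scores ln(2) on every sample.
baseline = binary_crossentropy(1, sigmoid(0.0))
print(round(baseline, 4))  # 0.6931
```

This baseline of about 0.69 is a handy reference point: the loss in the first epoch of the training log further down starts right around this value, meaning the untrained model is essentially guessing.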

To keep the explanation simple, only a small network of three convolution layers + two fully connected layers is built here. Even so, this small model can solve some relatively simple image problems. If you need a more complex model, simply modify the model building and configuration code inside this function.
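The size of this small network can be worked out by hand. The sketch below follows the layer sequence of build_model and assumes 150 x 150 RGB input; that size is my assumption, since this post does not state IMG_W and IMG_H. It relies on the Keras defaults of 'valid' padding and stride 1 for Conv2D, and a pooling stride equal to the pool size.

```python
def conv_out(size, kernel=3):
    """Output size of a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output size of non-overlapping max pooling (floor division)."""
    return size // pool

def count_params(img_size=150, channels=3):
    """Return (final feature map size, flattened length, trainable parameters)."""
    size, in_ch, total = img_size, channels, 0
    for filters in (32, 32, 64):                    # the three Conv2D layers
        total += 3 * 3 * in_ch * filters + filters  # kernel weights + biases
        size = pool_out(conv_out(size))
        in_ch = filters
    flat = size * size * in_ch                      # length after Flatten()
    total += flat * 64 + 64                         # Dense(64)
    total += 64 * 1 + 1                             # Dense(1)
    return size, flat, total

print(count_params())  # (17, 18496, 1212513)
```

Almost all of the roughly 1.2 million parameters sit in the first fully connected layer, which is one reason Dropout is placed right after it.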

2.2 Model training

Since we use generators to produce the data streams, the fit_generator function is used for training. The code is as follows:

model = build_model(input_shape=(IMG_W, IMG_H, IMG_CH))  # input image dimensions
# Model training; keep the returned History object so the curves can be plotted later
history = model.fit_generator(train_generator,  # the data stream
                    steps_per_epoch=train_samples_num // batch_size, 
                    epochs=epochs,
                    validation_data=val_generator,
                    validation_steps=val_samples_num // batch_size)
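The two step counts are plain integer divisions. As a worked example with the data set sizes from section 1: the log below shows 62 steps per epoch, which for 2000 training images implies a batch size of 32. That batch size is my inference from the log, not a value stated in this post.

```python
train_samples_num = 2000   # 1000 cats + 1000 dogs
val_samples_num = 800      # 400 cats + 400 dogs
batch_size = 32            # assumption: inferred from the 62 steps in the log

steps_per_epoch = train_samples_num // batch_size
validation_steps = val_samples_num // batch_size
# Images that do not fit into a whole number of batches
remainder = train_samples_num - steps_per_epoch * batch_size
print(steps_per_epoch, validation_steps, remainder)  # 62 25 16
```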

Since I trained on my laptop, which has no discrete graphics card, let alone an Nvidia one, training was slow, but it worked. The detailed results of the run can be seen on my Github.

------------------------------------------------------------

Epoch 1/20
62/62 [==============================] - 136s 2s/step - loss: 0.6976 - acc: 0.5015 - val_loss: 0.6937 - val_acc: 0.5000
Epoch 2/20
62/62 [==============================] - 137s 2s/step - loss: 0.6926 - acc: 0.5131 - val_loss: 0.6846 - val_acc: 0.5813
Epoch 3/20
62/62 [==============================] - 152s 2s/step - loss: 0.6821 - acc: 0.5544 - val_loss: 0.6735 - val_acc: 0.6100

...

Epoch 18/20
62/62 [==============================] - 140s 2s/step - loss: 0.5776 - acc: 0.6880 - val_loss: 0.5615 - val_acc: 0.7262
Epoch 19/20
62/62 [==============================] - 143s 2s/step - loss: 0.5766 - acc: 0.6971 - val_loss: 0.5852 - val_acc: 0.6800
Epoch 20/20
62/62 [==============================] - 140s 2s/step - loss: 0.5654 - acc: 0.7117 - val_loss: 0.5374 - val_acc: 0.7450

------------------------------------------------------------

The loss and acc values show roughly that, during training, loss keeps decreasing and acc keeps increasing, with a relatively stable trend.

We can plot the loss and acc recorded during training to see how they change.

# Plot the acc and loss recorded during training
import matplotlib.pyplot as plt
# The next line is a Jupyter notebook magic; remove it when running as a plain script
%matplotlib inline
def plot_training(history):
    plt.figure(12)
    
    plt.subplot(121)
    train_acc = history.history['acc']
    val_acc = history.history['val_acc']
    epochs = range(len(train_acc))
    plt.plot(epochs, train_acc, 'b',label='train_acc')
    plt.plot(epochs, val_acc, 'r',label='test_acc')
    plt.title('Train and Test accuracy')
    plt.legend()
    
    plt.subplot(122)
    train_loss = history.history['loss']
    val_loss = history.history['val_loss']
    epochs = range(len(train_loss))
    plt.plot(epochs, train_loss, 'b',label='train_loss')
    plt.plot(epochs, val_loss, 'r',label='test_loss')
    plt.title('Train and Test loss')
    plt.legend()
 
    plt.show()

Clearly, neither acc nor loss has reached a plateau because the number of epochs is small, so increasing the epochs should give a better result. In the original blog post, the author reached an accuracy of about 80% after 50 epochs, while my validation accuracy after 20 epochs is about 74%.

2.3 Predicting new samples

Predictions for individual images

Once the model has been trained, we use it to predict new images and see whether it gives accurate results. The prediction function is:

# Use the trained model to predict new samples
import numpy as np
from PIL import Image
from keras.preprocessing import image
def predict(model, img_path, target_size):
    img = Image.open(img_path)  # load the image
    if img.size != target_size:
        img = img.resize(target_size)
    x = image.img_to_array(img)
    x *= 1. / 255  # same scaling as ImageDataGenerator(rescale=1. / 255)
    x = np.expand_dims(x, axis=0)  # add the batch dimension
    preds = model.predict(x)  # predict
    return preds[0]

Use this function to predict a single image:

predict(model, r'E:\PyProjects\DataSet\FireAI\DeepLearning/FireAI005/cat11.jpg', (IMG_W, IMG_H))

predict(model, r'E:\PyProjects\DataSet\FireAI\DeepLearning/FireAI005/dog4.jpg', (IMG_W, IMG_H))

array([0.14361556], dtype=float32)

array([0.9942463], dtype=float32)

For these single images, the model outputs 0.14 for cat11.jpg and 0.99 for dog4.jpg. The sigmoid output is the probability of class 1, so class 0 is cat and class 1 is dog, and the model separates the two images well.
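Turning such a probability into a class name takes a threshold and the index-to-name mapping. The sketch below hard-codes the mapping, assuming that flow_from_directory numbers the class subfolders in alphanumeric order ('cats' before 'dogs'); with a real generator you can read it from train_generator.class_indices instead. Since the sigmoid output is the probability of index 1 and cat11.jpg scores low, index 0 has to be cats. The helper `to_label` is my own.

```python
# Assumed mapping: class subfolders are indexed in alphanumeric order.
class_indices = {'cats': 0, 'dogs': 1}

def to_label(prob, threshold=0.5):
    """Turn the sigmoid output P(class 1) into a class name."""
    index = 1 if prob > threshold else 0
    names = {idx: name for name, idx in class_indices.items()}
    return names[index]

print(to_label(0.14361556), to_label(0.9942463))  # cats dogs
```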

Prediction of multiple images

What if you wanted to use this model to predict all the images in a folder?

# Predict all images in a folder
new_sample_gen=ImageDataGenerator(rescale=1. / 255)
newsample_generator=new_sample_gen.flow_from_directory(
        r'E:\PyProjects\DataSet\FireAI\DeepLearning',
        target_size=(IMG_W, IMG_H),
        batch_size=16,
        class_mode=None,
        shuffle=False)
predicted=model.predict_generator(newsample_generator)
print(predicted)

Found 4 images belonging to 2 classes.
[[0.14361556]
 [0.5149474]
 [0.71455824]
 [0.9942463]]

In the above result, the second value, 0.5149, corresponds to a cat image and should therefore be less than 0.5. This prediction is wrong, but the other three are correct, which gives a rough accuracy of 3/4 = 75%.
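The 75% figure can be reproduced with a few lines. The true labels below are my assumption based on the description above (two cat images followed by two dog images, with 0 for cat and 1 for dog); the probabilities are the ones printed by the model.

```python
predicted = [0.14361556, 0.5149474, 0.71455824, 0.9942463]
true_labels = [0, 0, 1, 1]   # assumption: two cat images, then two dog images

# Threshold the probabilities at 0.5 to get predicted class indices
predicted_labels = [1 if p > 0.5 else 0 for p in predicted]
correct = sum(int(p == t) for p, t in zip(predicted_labels, true_labels))
accuracy = correct / len(true_labels)
print(predicted_labels, accuracy)  # [0, 1, 1, 1] 0.75
```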

2.4 Model saving and loading

The model should be saved to disk promptly to avoid losing the training results. The following code saves it:

# Save the model
# model.save_weights(r'E:\PyProjects\DataSet\FireAI\DeepLearning/FireAI005/FireAI005_Model.h5')  # this saves only the weights, not the model structure
model.save(r'E:\PyProjects\DataSet\FireAI\DeepLearning/FireAI005/FireAI005_Model2.h5')  # this saves the complete model and is the recommended way

# Load the model and predict
from keras.models import load_model
saved_model = load_model(r'E:\PyProjects\DataSet\FireAI\DeepLearning/FireAI005/FireAI005_Model2.h5')

predicted = saved_model.predict_generator(newsample_generator)
print(predicted)  # same result as the original model, which shows that the model was saved and loaded correctly

The results obtained here are exactly the same as predicted by the model above, indicating that the model is saved and loaded correctly.

######################## Summary ########################

1. This article covers the following: preparing a simple small data set, building a data stream from it, feeding the data stream into a Keras model for training, using the trained model to predict new images, and finally saving the model and loading the saved model back into memory.

2. The model used here is built by ourselves and has a relatively simple structure, with only three convolution layers and two fully connected layers. Therefore, its accuracy is not high.

#########################################################

Note: The code for this part has been uploaded to (my Github); you are welcome to download it.
