This is the fourth day of my participation in the August More Text Challenge. For details, see: August More Text Challenge

Deep Learning with Python for computer vision

This article is one of my notes from studying Deep Learning with Python (second edition), covering Chapter 5: Deep Learning for Computer Vision.

This post marks my switch from Jupyter notebooks to Markdown; you can find the original .ipynb notebooks on my GitHub or Gitee.

You can read the original copy of the book online (in English) at this website. The book’s author also released the accompanying Jupyter notebooks.

Introduction to convolutional neural networks

5.1 Introduction to convnets

Convolutional neural networks are great for computer vision problems.

Let’s start with the simplest possible convolutional neural network, applied to the MNIST problem that we solved with a fully connected network in Chapter 2:

from tensorflow.keras import layers
from tensorflow.keras import models

model = models.Sequential()

model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(layers.MaxPooling2D((2, 2)))

model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))

model.add(layers.Conv2D(64, (3, 3), activation='relu'))

model.add(layers.Flatten())
model.add(layers.Dense(64, activation="relu"))
model.add(layers.Dense(10, activation="softmax"))

input_shape is (image_height, image_width, image_channels).

The Conv2D and MaxPooling2D layers output 3D tensors of shape (height, width, channels); the height and width shrink layer by layer, while the number of channels is controlled by the first argument of Conv2D.

In the last three layers, we turn the (3, 3, 64) output of the final Conv2D layer into the result vector we want: the Flatten layer flattens the 3D tensor into 1D, and then the two Dense layers (the same kind we used in Chapter 2) finish with a 10-way classification.

Finally, take a look at the model structure:

model.summary()
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_3 (Conv2D)            (None, 26, 26, 32)        320       
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 13, 13, 32)        0         
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 11, 11, 64)        18496     
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 5, 5, 64)          0         
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 3, 3, 64)          36928     
_________________________________________________________________
flatten_1 (Flatten)          (None, 576)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 64)                36928     
_________________________________________________________________
dense_3 (Dense)              (None, 10)                650       
=================================================================
Total params: 93322
Trainable params: 93322
Non-trainable params: 0
_________________________________________________________________

OK, that’s the network built; it’s still easy. Next we train it, much the same as in Chapter 2 (note that the input shape is different):

# Load the TensorBoard notebook extension
# TensorBoard can visualize the training process
%load_ext tensorboard
# Clear any logs from previous runs
!rm -rf ./logs/
# Train convolutional neural networks on MNIST images

import datetime
import tensorflow as tf

from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical

(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

train_images = train_images.reshape((60000, 28, 28, 1))
train_images = train_images.astype('float32') / 255

test_images = test_images.reshape((10000, 28, 28, 1))
test_images = test_images.astype('float32') / 255

train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)

# Prepare TensorBoard
log_dir = "logs/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1)

model.compile(optimizer='rmsprop',
              loss="categorical_crossentropy",
              metrics=['accuracy'])
model.fit(train_images, train_labels, epochs=5, batch_size=64,
          callbacks=[tensorboard_callback])
Train on 60000 samples
Epoch 1/5
60000/60000 [==============================] - 36s 599us/sample - loss: 0.0156 - accuracy: 0.9953
Epoch 2/5
60000/60000 [==============================] - 33s 554us/sample - loss: 0.0127 - accuracy: 0.9960
Epoch 3/5
60000/60000 [==============================] - 31s 524us/sample - loss: 0.0097 - accuracy: 0.9971
Epoch 4/5
60000/60000 [==============================] - 32s 529us/sample - loss: 0.0092 - accuracy: 0.9974
Epoch 5/5
60000/60000 [==============================] - 31s 523us/sample - loss: 0.0095 - accuracy: 0.9971

%tensorboard --logdir logs/fit

Let’s look at the results in the test set:

test_loss, test_acc = model.evaluate(test_images, test_labels, verbose=2)
print(test_loss, test_acc)

Output:

10000/1 - 1s - loss: 0.0172 - accuracy: 0.9926
0.03441549262946125 0.9926

Convolution

Convolutional neural network

The densely connected layers we used before learn global patterns in the whole input feature space (for MNIST, all the pixels at once); the convolution layers here learn local patterns. In other words, Dense learns the image as a whole, while Conv learns pieces of the image, e.g. the 3×3 windows in the code we just wrote:

This gives convolutional neural networks two interesting properties:

  • The patterns a convolutional neural network learns are translation-invariant: once it has learned a pattern, it recognizes that pattern wherever else it appears and does not have to learn it again; a Dense network, by contrast, has to relearn the same pattern when it shows up in a new place. This property lets the convolutional neural network use data efficiently: it needs fewer training samples to learn representations that generalize well (it remembers each local pattern rather than memorizing the whole image).

  • A convolutional neural network can learn spatial hierarchies of patterns: after the first layer learns small local patterns, the next layer can build larger patterns out of those small pieces, and so on. The network can thus learn increasingly complex and abstract visual concepts, as the following picture illustrates:

Convolution layer

The 3D tensor that represents an image in our example has two spatial axes, height and width, plus a depth axis (also called the channels axis). For RGB images the depth is 3, one channel per color; for the MNIST grayscale images the depth is 1, a single number for the gray level. Such a 3D tensor, as well as the result of running a convolution over it, is called a feature map.

The convolution operation extracts small patches from the input feature map and applies the same transformation to every patch, producing the output feature map. The output is still a 3D tensor: it has a width and a height, and its depth can be any value, because the depth is a parameter of the layer and each channel along the depth axis now stands for a filter. A filter encodes one aspect of the input data; for example, a single filter could encode the concept "the input contains a face".

In the MNIST example just shown, the first convolution layer takes an input feature map of size (28, 28, 1) and outputs a feature map of size (26, 26, 32); that is, it computes 32 filters over its input. Each of the 32 channels along the depth axis is a 26×26 grid of values, the response map of one filter, showing that filter’s response at every location in the input. This is why the feature map is called a feature map: each slice along the depth axis is a feature (or filter), and the 2D tensor output[:, :, n] is the 2D spatial map of filter n’s response over the input.
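If you want to see a response map for yourself, here is a minimal sketch of my own (not code from the book); it assumes the `model` and `train_images` defined in the MNIST code above, and builds a second model that outputs the activations of the first convolution layer:

from tensorflow.keras.models import Model

# Build a second model that outputs the activations of the first convolution layer,
# then look at the response map of a single filter.
# Assumes `model` and `train_images` from the MNIST code above.
activation_model = Model(inputs=model.input, outputs=model.layers[0].output)
first_layer_activation = activation_model.predict(train_images[:1])
print(first_layer_activation.shape)                  # (1, 26, 26, 32)
response_map = first_layer_activation[0, :, :, 3]    # 2D response map of filter n=3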

Convolution operation

I don’t understand convolution itself very deeply (emmm, the same operation that shows up in complex analysis); I mostly relied on “Zhihu: How to explain convolution in an easy-to-understand way?” to get the intuition. Here all we really need is this function:

The initialization of Keras’s Conv2D layer is written as:

Conv2D(output_depth, (window_height, window_width))

The convolution operation has two core parameters:

  • The depth of the output feature map: we used 32 and 64 in the MNIST example;
  • The size of each patch (sliding window) extracted from the input: typically 3×3 or 5×5.

The convolution slides a window over every possible location of the input: at each location it takes the (window_height, window_width, input_depth) patch of the input and computes a dot product with a learned weight matrix, called the convolution kernel, producing a vector of shape (output_depth,). All of these vectors are then reassembled into the final 3D output of shape (height, width, output_depth), where each value maps back to a location in the input; for example, with a 3×3 sliding window, output[i, j, :] comes from the 3×3 patch of input centered on (i, j).

Convolutional Neural Networks – Basics, An Introduction to CNNs and Deep Learning
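To make the sliding-window dot product concrete, here is a minimal NumPy sketch of a stride-1, unpadded convolution; the function and variable names are my own illustration, not code from the book:

import numpy as np

def naive_conv2d(inputs, kernel):
    """inputs: (height, width, input_depth); kernel: (wh, ww, input_depth, output_depth)."""
    h, w, _ = inputs.shape
    wh, ww, _, output_depth = kernel.shape
    out_h, out_w = h - wh + 1, w - ww + 1           # border effect: the output shrinks
    output = np.zeros((out_h, out_w, output_depth))
    for i in range(out_h):
        for j in range(out_w):
            patch = inputs[i:i + wh, j:j + ww, :]   # one (wh, ww, input_depth) block
            for k in range(output_depth):
                # dot product of the patch with the k-th filter of the kernel
                output[i, j, k] = np.sum(patch * kernel[:, :, :, k])
    return output

x = np.random.rand(5, 5, 1)         # a tiny 5x5 single-channel "image"
w = np.random.rand(3, 3, 1, 32)     # 32 filters of size 3x3
print(naive_conv2d(x, w).shape)     # (3, 3, 32)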

Note that because of border effects and strides used, the width and height of our output may be different from the width and height of our input.

Border effects and padding

The border effect is that the output shrinks by a ring around the edge once you slide the window over the input. For example, with a 5×5 input image you can only extract nine 3×3 patches, so the output is 3×3:

We saw the same thing in the MNIST example: we started with 28×28, took 3×3 windows in the first layer, and got 26×26.

If you don’t want this shrinkage to happen, i.e. you want the output to keep the same spatial dimensions as the input, you need padding. Padding adds rows and columns around the borders of the input image: one ring for a 3×3 window, two rings for a 5×5 window:

The padding of the Keras Conv2D layer is set with the padding argument, which can be (a quick shape check follows this list):

  • "valid" (the default): no padding, only “valid” window positions are used; e.g. on a 5×5 input feature map, only the positions where a full 3×3 patch fits are extracted;
  • "same": pad so that the output has the same width and height as the input.
Convolution stride

The stride of the convolution is how far the window moves with each slide; everything we have done so far uses a stride of 1. A convolution with a stride greater than 1 is called a strided convolution. For example, here is one with a stride of 2:
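In Keras this just means passing strides to the layer; a minimal sketch of my own (not the book’s figure or code) showing the effect on the output size:

from tensorflow.keras import layers, models

m = models.Sequential([
    layers.Conv2D(32, (3, 3), strides=2, input_shape=(28, 28, 1)),  # -> (13, 13, 32): roughly halved
])
m.summary()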

That said, strided convolution is not used very much in practice 😂; to downsample a feature map like this we usually use max pooling instead.

Note:

Subsampling (downsampling): sampling a sequence once every few samples, so that the new sequence is a subsampled version of the original.

From Baidu Encyclopedia
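In other words (a trivial sketch of my own, not from the encyclopedia entry), subsampling a sequence just keeps every few samples:

signal = [0, 1, 2, 3, 4, 5, 6, 7]
subsampled = signal[::2]    # keep every other sample -> [0, 2, 4, 6]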

Max pooling

Like strided convolution, max pooling is used to downsample the feature map. In the initial MNIST example, the feature map was halved in size after each MaxPooling2D layer.

Max pooling slides a window over the input feature map and outputs the maximum value of each channel within the window. The operation is very similar to convolution, except the function applied is a hard-coded max instead of a learned linear transformation.

Max pooling is usually done with a 2×2 window and a stride of 2, which downsamples the feature map by a factor of 2. (Convolution, by contrast, typically uses a 3×3 window and a stride of 1.)
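Here is a tiny NumPy sketch (my own illustration, not code from the book) of 2×2 max pooling with stride 2 on a single channel:

import numpy as np

def naive_max_pool2d(feature_map, size=2):
    """2x2 max pooling with stride 2 on one channel of a feature map."""
    h, w = feature_map.shape
    output = np.zeros((h // size, w // size))
    for i in range(0, h - size + 1, size):
        for j in range(0, w - size + 1, size):
            # take the max of each non-overlapping size x size window
            output[i // size, j // size] = feature_map[i:i + size, j:j + size].max()
    return output

x = np.arange(16, dtype='float32').reshape(4, 4)
print(naive_max_pool2d(x))
# [[ 5.  7.]
#  [13. 15.]]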

If you just stack up a bunch of convolution layers without max pooling, there are two problems:

  • The feature map shrinks very slowly, so the later layers have far too many parameters, which aggravates overfitting;
  • It is bad for learning a spatial hierarchy of patterns: the later convolution layers still only see tiny windows of the original input, so they never get a more abstract view of the whole.

Besides max pooling, there are other ways to downsample, such as strided convolution and average pooling. But max pooling usually works better: what we want to know is whether a feature is present, and averaging dilutes that presence, while strided convolution may skip right over it.

In summary, we use max pooling (or another form of subsampling) to reduce the number of feature-map elements that need to be processed, and to let a stack of convolution layers observe larger and larger windows (covering a growing fraction of the original input), so the network can learn a spatial hierarchy of patterns.

Train a convolutional neural network from scratch on a small data set

5.2 Training a convnet from scratch on a small dataset

When doing computer vision, we often have to train an image classification model on a very small data set. Emmm, “small” here can mean anywhere from a few hundred to a few tens of thousands of images.

From here through the next few sections, we’ll train a small model from scratch, use a pre-trained network for feature extraction, and fine-tune a pre-trained network; all of these techniques can be used to solve image classification problems on small data sets.

In this section we train a small model from scratch to classify pictures of cats and dogs. For now we won’t do any regularization, and we won’t worry about overfitting yet.

Download the data

We will train the model on the Dogs vs. Cats dataset, a large collection of photos of cats and dogs. This data set is not built into Keras; we can download it from Kaggle: www.kaggle.com/c/dogs-vs-c…

Download it and unzip it,,, (emmmm, it’s a bit too big for my MBP, so I put it on an external hard drive 😂; emmmm, and I think it’s time to buy a new SSD).

Then we build the data set we will actually use: 1,000 training images each for cats and dogs, 500 each for validation, and 500 each for test. The code to do this:

# Copy the images into train, validation, and test directories
import os, shutil

original_dataset_dir = '/Volumes/WD/Files/dataset/dogs-vs-cats/dogs-vs-cats/train'    # Raw data set

base_dir = '/Volumes/WD/Files/dataset/dogs-vs-cats/cats_and_dogs_small'    # Where the smaller data set will be saved
os.mkdir(base_dir)


# Create directories for the split training, validation, and test sets
train_dir = os.path.join(base_dir, 'train')
os.mkdir(train_dir)
validation_dir = os.path.join(base_dir, 'validation')
os.mkdir(validation_dir)
test_dir = os.path.join(base_dir, 'test')
os.mkdir(test_dir)

# Separate cat and dog directories
# (the dog directories are created the same way, with 'cats' changed to 'dogs')
train_cats_dir = os.path.join(train_dir, 'cats')
os.mkdir(train_cats_dir)

validation_cats_dir = os.path.join(validation_dir, 'cats')
os.mkdir(validation_cats_dir)

test_cats_dir = os.path.join(test_dir, 'cats')
os.mkdir(test_cats_dir)

# Copy cat pictures
fnames = [f'cat.{i}.jpg' for i in range(1000)]    # equivalent to 'cat.{}.jpg'.format(i)
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(train_cats_dir, fname)
    shutil.copyfile(src, dst)

fnames = [f'cat.{i}.jpg' for i in range(1000, 1500)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(validation_cats_dir, fname)
    shutil.copyfile(src, dst)

fnames = [f'cat.{i}.jpg' for i in range(1500, 2000)]
for fname in fnames:
    src = os.path.join(original_dataset_dir, fname)
    dst = os.path.join(test_cats_dir, fname)
    shutil.copyfile(src, dst)

# Copy dog pictures: same as the cat loops above, just change cat to dog

# Check
print('total training cat images:', len(os.listdir(train_cats_dir)))
print('total validation cat images:', len(os.listdir(validation_cats_dir)))
print('total test cat images:', len(os.listdir(test_cats_dir)))

print('total training dog images:', len(os.listdir(train_dogs_dir)))
print('total validation dog images:', len(os.listdir(validation_dogs_dir)))
print('total test dog images:', len(os.listdir(test_dogs_dir)))

Output results:

total training cat images: 1000
total validation cat images: 500
total test cat images: 500
total training dog images: 1000
total validation dog images: 500
total test dog images: 500

Building the network

In almost all convolutional neural networks, the depth of the feature maps gradually increases while their size gradually decreases. We do the same thing here.

Our current problem is binary classification, so the last layer is a 1-unit Dense layer with a sigmoid activation:

# Instantiate a small convolutional neural network that categorizes cats and dogs

from tensorflow.keras import layers
from tensorflow.keras import models

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(150, 150, 3)))
model.add(layers.MaxPooling2D((2, 2)))

model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))

model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))

model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))

model.add(layers.Flatten())
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))

Look at the structure of the network:

model.summary()
Model: "sequential_4"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_16 (Conv2D)           (None, 148, 148, 32)      896       
_________________________________________________________________
max_pooling2d_16 (MaxPooling (None, 74, 74, 32)        0         
_________________________________________________________________
conv2d_17 (Conv2D)           (None, 72, 72, 64)        18496     
_________________________________________________________________
max_pooling2d_17 (MaxPooling (None, 36, 36, 64)        0         
_________________________________________________________________
conv2d_18 (Conv2D)           (None, 34, 34, 128)       73856     
_________________________________________________________________
max_pooling2d_18 (MaxPooling (None, 17, 17, 128)       0         
_________________________________________________________________
conv2d_19 (Conv2D)           (None, 15, 15, 128)       147584    
_________________________________________________________________
max_pooling2d_19 (MaxPooling (None, 7, 7, 128)         0         
_________________________________________________________________
flatten_4 (Flatten)          (None, 6272)              0         
_________________________________________________________________
dense_8 (Dense)              (None, 512)               3211776   
_________________________________________________________________
dense_9 (Dense)              (None, 1)                 513       
=================================================================
Total params: 3453121
Trainable params: 3453121
Non-trainable params: 0
_________________________________________________________________

Then we compile the network: the loss function is binary crossentropy, and the optimizer is RMSprop. Because we want to set the learning rate, we pass an optimizers.RMSprop instance rather than the string 'rmsprop':

from tensorflow.keras import optimizers

model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(lr=1e-4),
              metrics=['acc'])

Data preprocessing

We need to turn those image files into floating-point tensors before feeding them to the neural network. The steps are as follows (a manual sketch for a single image comes right after this list):

  1. Read the image files
  2. Decode the JPEG contents into RGB grids of pixels
  3. Convert these into floating-point tensors
  4. Rescale the pixel values from [0, 255] to [0, 1]
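Here is the manual sketch mentioned above for a single image (my own example; the file path is hypothetical). The ImageDataGenerator code below does the same thing for whole directories:

from tensorflow.keras.preprocessing import image

img = image.load_img('cat.0.jpg', target_size=(150, 150))  # hypothetical path; read, decode, resize
x = image.img_to_array(img)                                 # float32 tensor of shape (150, 150, 3)
x = x / 255.0                                               # rescale [0, 255] -> [0, 1]
print(x.shape, x.min(), x.max())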

Keras provides some tools to automate this:

# Read images from directory using ImageDataGenerator

from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale=1./255)
test_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size=(150, 150),
    batch_size=20,
    class_mode='binary')    # binary labels, since we use binary_crossentropy

validation_generator = test_datagen.flow_from_directory(
    validation_dir,
    target_size=(150, 150),
    batch_size=20,
    class_mode='binary')
Output:

Found 2000 images belonging to 2 classes.
Found 1000 images belonging to 2 classes.

The resulting train_generator and validation_generator are Python generators (lazily evaluated). Each one yields one batch at a time, so you can think of it as a “batch generator”; let’s iterate over one batch and take a look:

for data_batch, labels_batch in train_generator:
    print('data batch shape:', data_batch.shape)
    print('labels batch shape:', labels_batch.shape)
    print('labels_batch:', labels_batch)
    break
Output:

data batch shape: (20, 150, 150, 3)
labels batch shape: (20,)
labels_batch: [1. 1. 1. 1. 1. 1. 1. 1. 1. 0. 1. 1. 1. 1. 0. 0. 1. 1. 0. 1.]

# Use the batch generator to fit the model
history = model.fit_generator(
    train_generator,
    steps_per_epoch=100,
    epochs=30,
    validation_data=validation_generator,
    validation_steps=50)

Run training:

Epoch 1/30
100/100 [==============================] - 97s 967ms/step - loss: 0.6901 - acc: 0.5450 - val_loss: 0.6785 - val_acc: 0.5270
......
Epoch 30/30
100/100 [==============================] - 162s 2s/step - loss: 0.0460 - acc: 0.9885 - val_loss: 1.0609 - val_acc: 0.7150

Because the batches are read from a generator, the fit we usually use becomes fit_generator. Its arguments are: the training data generator, the number of batches to draw from train_generator in each epoch (steps_per_epoch), the number of epochs, the validation data generator, and the number of batches to draw from validation_generator for evaluation (validation_steps).

steps_per_epoch = total number of training samples / batch_size specified when building the generator

validation_steps works the same way as steps_per_epoch, only for the validation set.
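As a quick check of the formula (my own arithmetic, using the generators we just built with batch_size=20, 2,000 training and 1,000 validation images):

# steps_per_epoch = training samples / batch_size; validation_steps likewise
steps_per_epoch = 2000 // 20       # = 100
validation_steps = 1000 // 20      # = 50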

Save the trained model with this line of code:

# Save model
model.save('/Volumes/WD/Files/dataset/dogs-vs-cats/cats_and_dogs_small_1.h5')

Then let’s draw a picture of the training process:

# Draw the loss curve and accuracy curve during training
import matplotlib.pyplot as plt

acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']

epochs = range(1, len(acc) + 1)

plt.plot(epochs, acc, 'bo-', label='Training acc')
plt.plot(epochs, val_acc, 'sr-', label='Validation acc')
plt.title('Training and validation accuracy')
plt.legend()

plt.figure()

plt.plot(epochs, loss, 'bo-', label='Training loss')
plt.plot(epochs, val_loss, 'sr-', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()

plt.show()

As expected, it has been overfitting since about epoch 5.

Next, we need to reduce overfitting with data augmentation.

Data augmentation

Data augmentation is used almost universally when processing images with deep learning.

Overfitting comes from having too few training samples (if there were enough samples, the model would see almost every possible case and would hardly ever be wrong). Data augmentation generates more training data from the existing samples, applying a variety of random transformations that still produce believable images.

In Keras, we use ImageDataGenerator to enhance the data by setting a few parameters:

datagen = ImageDataGenerator(
      rotation_range=40,        # Range (0-180 degrees) within which to randomly rotate images
      width_shift_range=0.2,    # Fraction of total width for random horizontal shifts
      height_shift_range=0.2,   # Fraction of total height for random vertical shifts
      shear_range=0.2,          # Range for random shearing transformations
      zoom_range=0.2,           # Range for random zoom
      horizontal_flip=True,     # Randomly flip images horizontally
      fill_mode='nearest')      # How to fill in newly created pixels

Let’s take one picture and try the augmentation on it:

from tensorflow.keras.preprocessing import image

fnames = [os.path.join(train_cats_dir, fname) for 
          fname in os.listdir(train_cats_dir)]

img_path = fnames[3]
img = image.load_img(img_path, target_size=(150, 150))    # Read the image

x = image.img_to_array(img)    # shape (150, 150, 3)
x = x.reshape((1,) + x.shape)  # shape (1, 150, 150, 3)

i=0
for batch in datagen.flow(x, batch_size=1):
    plt.figure(i)
    imgplot = plt.imshow(image.array_to_img(batch[0]))
    i += 1
    if i % 4 == 0:
        break

plt.show()

Note that data augmentation does not bring in new information; it only remixes information that already exists. So when the data is very scarce, data augmentation alone is not enough to eliminate overfitting, and we also need to add Dropout before the Dense layers.

# Define a new convolutional neural network with Dropout

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(150, 150, 3)))
model.add(layers.MaxPooling2D((2, 2)))

model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))

model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))

model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))

model.add(layers.Flatten())

model.add(layers.Dropout(0.5))    # 👈 added Dropout

model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy', 
              optimizer=optimizers.RMSprop(lr=1e-4), 
              metrics=['acc'])
# Train the convolutional neural network using the data-augmentation generators

# Data generator
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,)

test_datagen = ImageDataGenerator(rescale=1./255)    # The validation/test data must not be augmented

train_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size=(150, 150),
    batch_size=32,
    class_mode='binary')

validation_generator = test_datagen.flow_from_directory(
    validation_dir,
    target_size=(150, 150),
    batch_size=32,
    class_mode='binary')

# training
history = model.fit_generator(
    train_generator,
    steps_per_epoch=100,
    epochs=100,
    validation_data=validation_generator,
    validation_steps=50)

# Save model
model.save('/Volumes/WD/Files/dataset/dogs-vs-cats/cats_and_dogs_small_2.h5')
Output:

Found 2000 images belonging to 2 classes.
Found 1000 images belonging to 2 classes.
Epoch 1/100
100/100 [==============================] - 142s 1s/step - loss: 0.6909 - acc: 0.5265 - val_loss: 0.6799 - val_acc: 0.5127
...
Epoch 100/100
100/100 [==============================] - 130s 1s/step - loss: 0.3140 - acc: 0.8624 - val_loss: 0.4295 - val_acc: 0.8274
# Draw the loss curve and accuracy curve during training
import matplotlib.pyplot as plt

acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']

epochs = range(1, len(acc) + 1)

plt.plot(epochs, acc, 'bo-', label='Training acc')
plt.plot(epochs, val_acc, 'r-', label='Validation acc')
plt.title('Training and validation accuracy')
plt.legend()

plt.figure()

plt.plot(epochs, loss, 'bo-', label='Training loss')
plt.plot(epochs, val_loss, 'r-', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()

plt.show()

With data augmentation and Dropout, 👌 the overfitting is much better and the accuracy has improved.

Next, we will use some techniques to further optimize the model.

[To be continued]