I. Preliminary work

My environment:

  • Language: Python 3.6.5
  • Editor: Jupyter Notebook
  • Deep learning environment: TensorFlow 2.4.1

Recommended reading:

  • 100 Examples of Deep Learning: Convolutional neural network (VGG-19) to recognize the characters in Ling Cage | Day 7
  • 100 Examples of Deep Learning: Convolutional neural network (VGG-16) to recognize the One Piece Straw Hat crew | Day 6

From the column: 100 Examples of Deep Learning

1. Set the GPU (skip this step if you are using a CPU)

import tensorflow as tf
gpus = tf.config.list_physical_devices("GPU")

if gpus:
    gpu0 = gpus[0]  # If there are multiple GPUs, use only GPU 0
    tf.config.experimental.set_memory_growth(gpu0, True)  # Allocate GPU memory on demand
    tf.config.set_visible_devices([gpu0],"GPU")

2. Import data

import tensorflow as tf
from tensorflow.keras import datasets, layers, models
import matplotlib.pyplot as plt

(train_images, train_labels), (test_images, test_labels) = datasets.mnist.load_data()

3. Normalization

# Normalize the pixel values to the range 0 to 1.
train_images, test_images = train_images / 255.0, test_images / 255.0

train_images.shape,test_images.shape,train_labels.shape,test_labels.shape
"""
Output: ((60000, 28, 28), (10000, 28, 28), (60000,), (10000,))
"""
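The effect of dividing by 255.0 can be checked on a tiny array. This is only a minimal sketch: the 2x2 array `raw` is a made-up stand-in for an MNIST image, not data from the dataset.

```python
import numpy as np

# Hypothetical 2x2 "image" with raw 8-bit pixel values (stand-in for an MNIST image).
raw = np.array([[0, 51], [102, 255]], dtype=np.uint8)

# Dividing by 255.0 converts the integers to floating point and rescales them to [0, 1].
scaled = raw / 255.0

print(scaled.dtype)                 # float64
print(scaled.min(), scaled.max())   # 0.0 1.0
```

The division also changes the data type from integer to float, which is what the network expects as input.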

4. Visualization

plt.figure(figsize=(20, 10))
for i in range(20):
    plt.subplot(5, 10, i + 1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(train_images[i], cmap=plt.cm.binary)
    plt.xlabel(train_labels[i])
plt.show()

5. Reshape the images

# Adjust the data to the format we need
train_images = train_images.reshape((60000, 28, 28, 1))
test_images = test_images.reshape((10000, 28, 28, 1))

# The images were already normalized in step 3, so they are not divided by 255 again here.

train_images.shape,test_images.shape,train_labels.shape,test_labels.shape
"""
Output: ((60000, 28, 28, 1), (10000, 28, 28, 1), (60000,), (10000,))
"""
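The reshape adds a trailing channel axis of length 1, which is the layout Conv2D expects for grayscale images. A minimal sketch of the same step without hard-coding the number of images, using a small made-up array (`images` is a stand-in, not the real dataset):

```python
import numpy as np

# Hypothetical stand-in for a batch of grayscale images: 5 images of 28x28.
images = np.zeros((5, 28, 28), dtype=np.float32)

# -1 lets NumPy infer the number of images, so the same line works for
# both the training set and the test set.
with_channel = images.reshape((-1, 28, 28, 1))

# Equivalent: append a new axis of length 1 at the end.
also_with_channel = np.expand_dims(images, axis=-1)

print(with_channel.shape)       # (5, 28, 28, 1)
print(also_with_channel.shape)  # (5, 28, 28, 1)
```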

II. Build the CNN model

model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),  # Convolution layer 1, 3*3 kernel
    layers.MaxPooling2D((2, 2)),                   # Pooling layer 1, 2*2 down-sampling
    layers.Conv2D(64, (3, 3), activation='relu'),  # Convolution layer 2, 3*3 kernel
    layers.MaxPooling2D((2, 2)),                   # Pooling layer 2, 2*2 down-sampling

    layers.Flatten(),                              # Flatten layer, connects the convolution layers to the fully connected layers
    layers.Dense(64, activation='relu'),           # Fully connected layer, further feature extraction
    layers.Dense(10)                               # Output layer, one raw score (logit) per digit
])
# Print the network structure
model.summary()

III. Compile the model

"""
The optimizer, loss function, and metrics are all set here. For details, see my blog:
https://blog.csdn.net/qq_38251616/category_10258234.html
"""
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
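The `from_logits=True` argument tells the loss that the model outputs raw scores rather than probabilities, so the softmax is applied inside the loss. The following is a rough pure-Python sketch of what that loss computes for a single sample. The function name and the three example logits are made up for illustration, and Keras additionally averages the per-sample losses over the batch.

```python
import math

def sparse_categorical_crossentropy_from_logits(logits, label):
    """Cross-entropy for one sample: softmax over the logits, then -log of the true-class probability."""
    # Subtract the largest logit for numerical stability; this does not change the softmax.
    largest = max(logits)
    shifted = [z - largest for z in logits]
    log_sum_exp = math.log(sum(math.exp(z) for z in shifted))
    # The log-softmax of the true class is shifted[label] - log_sum_exp.
    return -(shifted[label] - log_sum_exp)

# Hypothetical logits for a 3-class problem (the model in this article has 10 outputs).
logits = [2.0, 1.0, 0.1]
loss_correct = sparse_categorical_crossentropy_from_logits(logits, label=0)
loss_wrong = sparse_categorical_crossentropy_from_logits(logits, label=2)

print(loss_correct < loss_wrong)  # True
```

The loss is small when the highest score belongs to the true label and grows as the true label's score falls behind the others.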

IV. Train the model

"""
The training dataset (images and labels), the validation dataset (images and labels), and the number of epochs are set here. For details, see my blog:
https://blog.csdn.net/qq_38251616/category_10258234.html
"""
history = model.fit(train_images, train_labels, epochs=10, 
                    validation_data=(test_images, test_labels))

V. Prediction

With the network built above, the process can be understood simply: feed in one image and the model returns a group of 10 numbers, one score for each digit from 0 to 9. Because the output layer has no softmax, these are raw scores (logits) rather than probabilities, but the larger a score is, the more likely the image shows that digit.

plt.imshow(test_images[1])

Output the prediction for this image (index 1 in the test set):

pre = model.predict(test_images)
pre[1]
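To turn the 10 output scores into a single predicted digit, take the index of the largest one. A softmax converts the scores into probabilities if those are needed. The sketch below uses a made-up row, `example_logits`, as a stand-in for `pre[1]`, since the real values depend on how training went.

```python
import numpy as np

def softmax(logits):
    """Convert a 1-D array of logits into probabilities that sum to 1."""
    exps = np.exp(logits - np.max(logits))  # subtract the max for numerical stability
    return exps / exps.sum()

# Hypothetical output row standing in for pre[1]; real values depend on training.
example_logits = np.array([-3.0, 1.5, 9.0, 0.2, -4.1, -2.0, -1.3, -5.0, 0.7, -6.2])

probabilities = softmax(example_logits)
# The argmax is the same before and after softmax, because softmax preserves order.
predicted_digit = int(np.argmax(example_logits))

print(predicted_digit)  # 2
```

With the trained model, the same idea would be `np.argmax(pre[1])` for the predicted digit and `tf.nn.softmax(pre[1])` for the probabilities.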

VI. Detailed explanation of key points

This article uses a simple LeNet-5-style CNN, one of the simplest CNN models. If you are new to deep learning, first try to get the code running from start to finish, and then work on understanding it.

1. Introduction to the MNIST handwritten digit dataset

The MNIST handwritten digit dataset comes from the National Institute of Standards and Technology (NIST) and is one of the best-known public datasets. The digit images were written by hand by 250 people from different occupations. The dataset can be downloaded from yann.lecun.com/exdb/mnist/ (the files need to be decompressed after downloading). In this article it is loaded directly with this line of code:

(train_images, train_labels), (test_images, test_labels) = datasets.mnist.load_data()

The MNIST dataset contains 70,000 images: 60,000 for training and 10,000 for testing. Every image is 28*28 pixels. A sample of the dataset is as follows:

If we flatten the pixels of each image into a vector, we get a vector of length 28 * 28 = 784. We can therefore view the training set as a [60000, 784] tensor, where the first dimension is the index of the image and the second dimension is the index of the pixel within the image. After normalization, each pixel has a value between 0 and 1.
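The flattening described above can be sketched with NumPy. The array below is a small made-up stand-in (6 images instead of 60,000), and the single bright pixel is only there to show where a pixel at row r, column c ends up in the flat vector.

```python
import numpy as np

# Hypothetical stand-in for the training images: 6 images instead of 60,000.
images = np.zeros((6, 28, 28), dtype=np.float32)
images[0, 2, 3] = 1.0  # mark the pixel at row 2, column 3 of the first image

# Keep the first axis (image index) and merge the two pixel axes into one.
flat = images.reshape((images.shape[0], -1))

print(flat.shape)           # (6, 784)
print(flat[0, 2 * 28 + 3])  # 1.0, because pixel (r, c) lands at index r * 28 + c
```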

2. Neural network program description

The neural network program can be briefly summarized as follows:

3. Network structure description

Structure of the model

The role of each layer

  • Input layer: feeds data into the network
  • Convolution layer: uses convolution kernels to extract image features
  • Pooling layer: performs down-sampling, representing image features at a higher level of abstraction
  • Flatten layer: flattens multi-dimensional input into one dimension; commonly used in the transition from the convolution layers to the fully connected layers
  • Fully connected layer: plays the role of a "feature extractor"
  • Output layer: outputs the result
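To see how these layers fit together in the model from Part II, the output size of each layer can be worked out by hand. The sketch below assumes the Keras defaults that the model code relies on (stride 1 and no padding for Conv2D, stride equal to the pool size for MaxPooling2D). The helper names are made up for illustration.

```python
def conv_output(size, kernel):
    """Output width/height of a convolution with stride 1 and no padding."""
    return size - kernel + 1

def pool_output(size, pool):
    """Output width/height of pooling whose stride equals the pool size."""
    return size // pool

size = 28
size = conv_output(size, 3)    # Conv2D(32, (3, 3))   -> 26
size = pool_output(size, 2)    # MaxPooling2D((2, 2)) -> 13
size = conv_output(size, 3)    # Conv2D(64, (3, 3))   -> 11
size = pool_output(size, 2)    # MaxPooling2D((2, 2)) -> 5
flattened = size * size * 64   # Flatten: 5 * 5 * 64 values

# Trainable parameters: weights plus one bias per filter or output unit.
params = {
    "conv1": 3 * 3 * 1 * 32 + 32,
    "conv2": 3 * 3 * 32 * 64 + 64,
    "dense1": flattened * 64 + 64,
    "dense2": 64 * 10 + 10,
}
total_params = sum(params.values())

print(flattened, total_params)  # 1600 121930
```

These numbers can be compared against the output of `model.summary()`. Most of the parameters sit in the first fully connected layer, not in the convolution layers.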
