I. Preliminary work

In this article, ResNet-50 is used to recognize and classify bird pictures.

My environment:

Language environment: Python 3.6.5
Editor: Jupyter Notebook
Deep learning environment: TensorFlow 2.4.1
Data address: [Portal]

🚀 From the column: [100 Examples of Deep Learning]

1. Set the GPU

Comment out this section of code if you are using a CPU.

import tensorflow as tf

gpus = tf.config.list_physical_devices("GPU")

if gpus:
    tf.config.experimental.set_memory_growth(gpus[0], True)  # Allocate GPU memory on demand
    tf.config.set_visible_devices([gpus[0]], "GPU")

2. Import data

import matplotlib.pyplot as plt
# Support Chinese
plt.rcParams['font.sans-serif'] = ['SimHei']  # is used to display Chinese labels normally
plt.rcParams['axes.unicode_minus'] = False  # is used to display the negative sign normally

import os,PIL

# Set random seeds to reproduce the results as much as possible
import numpy as np
np.random.seed(1)

# Set random seeds to reproduce the results as much as possible
import tensorflow as tf
tf.random.set_seed(1)

from tensorflow import keras
from tensorflow.keras import layers,models

import pathlib

data_dir = "D:/jupyter notebook/DL-100-days/datasets/bird_photos"

data_dir = pathlib.Path(data_dir)

3. View data

image_count = len(list(data_dir.glob('*/*')))

print("The total number of pictures is:", image_count)

The total number of pictures is: 565

II. Data preprocessing

Folder	Number of images
Bananaquit	166
Black Throated Bushtiti	111
Black skimmer	122
Cockatoo	166
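The per-folder counts in the table can be reproduced with a few lines of pathlib. This is only a minimal sketch, assuming one subfolder per class directly under the dataset root; the helper name count_images_per_class is hypothetical and is not part of the original notebook.

```python
import pathlib


def count_images_per_class(root):
    """Return {folder name: number of files} for each class subfolder of root."""
    root = pathlib.Path(root)
    counts = {}
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        # Count only regular files directly inside the class folder.
        counts[class_dir.name] = sum(1 for f in class_dir.iterdir() if f.is_file())
    return counts
```

Calling count_images_per_class(data_dir) on the bird dataset should give the four counts listed in the table, and their sum should equal the total of 565 printed earlier.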

1. Load data

Use the image_dataset_from_directory method to load the data from disk into a tf.data.Dataset.

batch_size = 8
img_height = 224
img_width = 224

Readers using TensorFlow 2.2.0 may encounter the error: module 'tensorflow.keras.preprocessing' has no attribute 'image_dataset_from_directory'. Upgrading TensorFlow resolves it.

# For a detailed introduction to image_dataset_from_directory(), see: https://mtyjkh.blog.csdn.net/article/details/117018789
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="training",
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size)

Found 565 files belonging to 4 classes.
Using 452 files for training.

# For a detailed introduction to image_dataset_from_directory(), see: https://mtyjkh.blog.csdn.net/article/details/117018789
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="validation",
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size)

Found 565 files belonging to 4 classes.
Using 113 files for validation.
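The two "Using ... files" messages follow directly from validation_split=0.2 and the total of 565 files. The sketch below redoes that arithmetic; it assumes the validation share is truncated to a whole number of files, and the function names are hypothetical helpers, not TensorFlow APIs.

```python
import math


def split_counts(total_files, validation_split):
    """Return (training, validation) file counts for a fractional split."""
    # Assumption: the validation share is truncated to a whole number of files.
    n_val = int(total_files * validation_split)
    return total_files - n_val, n_val


def batches_per_epoch(n_files, batch_size):
    """Number of batches when the last, smaller batch is kept."""
    return math.ceil(n_files / batch_size)


n_train, n_val = split_counts(565, 0.2)
print(n_train, n_val)                  # 452 113
print(batches_per_epoch(n_train, 8))   # 57
```

With batch_size = 8, the 452 training files give 57 batches per epoch, which is the 57/57 step counter that shows up in the training log.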

We can output the labels of the dataset through class_names. The labels correspond to the directory names in alphabetical order.

class_names = train_ds.class_names
print(class_names)

['Bananaquit', 'Black Throated Bushtiti', 'Black skimmer', 'Cockatoo']
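The order of this list decides which integer label each class gets. A small sketch of the mapping, assuming the loader sorts the folder names as plain strings (the variable names here are illustrative only):

```python
folders = ["Cockatoo", "Black skimmer", "Bananaquit", "Black Throated Bushtiti"]

# Plain string sorting compares character codes, so uppercase "T" sorts
# before lowercase "s" and "Black Throated Bushtiti" precedes "Black skimmer".
class_names_sketch = sorted(folders)

# The integer label of each class is its position in the sorted list.
label_of = {name: index for index, name in enumerate(class_names_sketch)}
print(class_names_sketch)
print(label_of["Cockatoo"])  # 3
```

This is why 'Black Throated Bushtiti' appears before 'Black skimmer' in the output above.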

2. Visualize data

plt.figure(figsize=(10, 5))  # The width of the figure is 10 and the height is 5
plt.suptitle("Wechat official Account: STUDENT K")

for images, labels in train_ds.take(1):
    for i in range(8):

        ax = plt.subplot(2, 4, i + 1)

        plt.imshow(images[i].numpy().astype("uint8"))
        plt.title(class_names[labels[i]])
        
        plt.axis("off")

plt.imshow(images[1].numpy().astype("uint8"))

3. Check the data again

for image_batch, labels_batch in train_ds:
    print(image_batch.shape)
    print(labels_batch.shape)
    break

(8, 224, 224, 3)
(8,)

image_batch is a tensor of shape (8, 224, 224, 3): a batch of 8 images, each of shape 224x224x3 (the last dimension is the RGB color channels).
labels_batch is a tensor of shape (8,); these labels correspond to the 8 images.

4. Configure the data set

shuffle(): shuffles the data. For a detailed introduction to this function, see: zhuanlan.zhihu.com/p/42417456
prefetch(): prefetches data to speed up the run. It is described in my previous two articles.
cache(): caches the dataset in memory to speed up the run.

AUTOTUNE = tf.data.AUTOTUNE

train_ds = train_ds.cache().shuffle(1000).prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)
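To get a feel for what a buffer size means in shuffle(1000), here is a plain-Python sketch of buffer-based shuffling. It is a conceptual model only, not TensorFlow's actual implementation, and the function name buffered_shuffle is hypothetical.

```python
import random


def buffered_shuffle(items, buffer_size, seed=None):
    """Yield items in a randomized order using a fixed-size buffer."""
    if buffer_size < 1:
        raise ValueError("buffer_size must be at least 1")
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            buffer.append(item)      # still filling the buffer
            continue
        # Buffer is full: emit a random element and put the new item in its slot.
        index = rng.randrange(buffer_size)
        yield buffer[index]
        buffer[index] = item
    # Input exhausted: drain what is left in random order.
    rng.shuffle(buffer)
    yield from buffer
```

A buffer of 1 leaves the order unchanged, while a buffer at least as large as the dataset gives a full shuffle. Since the training set here is already batched into 57 batches, a buffer of 1000 should be large enough to hold all of them.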

If you get the error AttributeError: module 'tensorflow._api.v2.data' has no attribute 'AUTOTUNE', replace AUTOTUNE = tf.data.AUTOTUNE with AUTOTUNE = tf.data.experimental.AUTOTUNE.

III. Introduction to the residual network (ResNet)

1. What problem does the residual network solve

The residual network was designed to solve the degradation problem that appears when a neural network has too many hidden layers. Degradation means that as the network gets deeper, its accuracy saturates and then deteriorates rapidly, and this deterioration is not caused by overfitting.

Extension: “Two clouds” of Deep Neural Networks

Vanishing/exploding gradients

Put simply, when the network is too deep, model training struggles to converge. This problem can be largely controlled by normalized initialization and intermediate normalization layers. (It is enough to know this for now.)

Network degradation

As the depth of the network increases, its performance gradually rises to saturation and then drops rapidly, and this degradation is not caused by overfitting.
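The shortcut connection that ResNet uses against degradation can be shown with a toy example: each block outputs F(x) + x, so a block can fall back to the identity simply by pushing F toward zero. The sketch below uses plain lists instead of Keras layers, assumes a non-negative input, and all function names are made up for illustration.

```python
def relu(values):
    return [max(0.0, v) for v in values]


def linear(values, weights):
    """Multiply a vector by a weight matrix given as a list of rows."""
    return [sum(w * v for w, v in zip(row, values)) for row in weights]


def residual_block(x, w1, w2):
    """Compute relu(F(x) + x), where F is two linear layers with a ReLU between."""
    f = linear(relu(linear(x, w1)), w2)
    return relu([fi + xi for fi, xi in zip(f, x)])


x = [1.0, 2.0, 3.0]
zeros = [[0.0] * 3 for _ in range(3)]
# With all-zero weights F(x) is zero, so the block returns its input unchanged.
print(residual_block(x, zeros, zeros))  # [1.0, 2.0, 3.0]
```

An extra block of this kind can therefore leave the signal untouched, which is the intuition for why adding residual blocks should not make a deeper network worse.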

2. Introduction to ResNet-50

ResNet-50 has two basic blocks, called the Conv Block and the Identity Block.

Conv Block structure:

Identity Block structure:

Overall structure of ResNet-50:

IV. Build the ResNet-50 network model

This is the focus of this article. Try building your own ResNet-50 using the three figures above.
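Before reading the model code, it may help to see where the "50" comes from. The tally below uses the block counts of the ResNet50 function that follows, and counts only the main-path convolutions plus the final dense layer (the extra shortcut convolutions in the Conv Blocks are left out of this count).

```python
# Blocks per stage, read off the ResNet50 function in this article:
# stage 2 has blocks a-c, stage 3 a-d, stage 4 a-f, stage 5 a-c.
blocks_per_stage = {2: 3, 3: 4, 4: 6, 5: 3}
convs_per_block = 3   # 1x1, 3x3, 1x1 on the main path of each block

stem_convs = 1        # the initial 7x7 convolution
dense_layers = 1      # the final fully connected layer

weighted_layers = stem_convs + convs_per_block * sum(blocks_per_stage.values()) + dense_layers
print(weighted_layers)  # 50
```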

from keras import layers

from keras.layers import Input,Activation,BatchNormalization,Flatten
from keras.layers import Dense,Conv2D,MaxPooling2D,ZeroPadding2D,AveragePooling2D
from keras.models import Model

def identity_block(input_tensor, kernel_size, filters, stage, block):
    # Residual block whose shortcut is the unchanged input (input and output shapes match)

    filters1, filters2, filters3 = filters

    name_base = str(stage) + block + '_identity_block_'

    x = Conv2D(filters1, (1, 1), name=name_base + 'conv1')(input_tensor)
    x = BatchNormalization(name=name_base + 'bn1')(x)
    x = Activation('relu', name=name_base + 'relu1')(x)

    x = Conv2D(filters2, kernel_size, padding='same', name=name_base + 'conv2')(x)
    x = BatchNormalization(name=name_base + 'bn2')(x)
    x = Activation('relu', name=name_base + 'relu2')(x)

    x = Conv2D(filters3, (1, 1), name=name_base + 'conv3')(x)
    x = BatchNormalization(name=name_base + 'bn3')(x)

    # Add the shortcut (the block input) to the main path
    x = layers.add([x, input_tensor], name=name_base + 'add')
    x = Activation('relu', name=name_base + 'relu4')(x)
    return x


def conv_block(input_tensor, kernel_size, filters, stage, block, strides=(2, 2)):
    # Residual block whose shortcut goes through a 1x1 convolution so that
    # its shape matches the main path

    filters1, filters2, filters3 = filters

    res_name_base = str(stage) + block + '_conv_block_res_'
    name_base = str(stage) + block + '_conv_block_'

    x = Conv2D(filters1, (1, 1), strides=strides, name=name_base + 'conv1')(input_tensor)
    x = BatchNormalization(name=name_base + 'bn1')(x)
    x = Activation('relu', name=name_base + 'relu1')(x)

    x = Conv2D(filters2, kernel_size, padding='same', name=name_base + 'conv2')(x)
    x = BatchNormalization(name=name_base + 'bn2')(x)
    x = Activation('relu', name=name_base + 'relu2')(x)

    x = Conv2D(filters3, (1, 1), name=name_base + 'conv3')(x)
    x = BatchNormalization(name=name_base + 'bn3')(x)

    # Project the input to the same shape as the main path
    shortcut = Conv2D(filters3, (1, 1), strides=strides, name=res_name_base + 'conv')(input_tensor)
    shortcut = BatchNormalization(name=res_name_base + 'bn')(shortcut)

    x = layers.add([x, shortcut], name=name_base + 'add')
    x = Activation('relu', name=name_base + 'relu4')(x)
    return x

def ResNet50(input_shape=[224, 224, 3], classes=1000):

    img_input = Input(shape=input_shape)
    x = ZeroPadding2D((3, 3))(img_input)

    x = Conv2D(64, (7, 7), strides=(2, 2), name='conv1')(x)
    x = BatchNormalization(name='bn_conv1')(x)
    x = Activation('relu')(x)
    x = MaxPooling2D((3, 3), strides=(2, 2))(x)

    x =     conv_block(x, 3, [64, 64, 256], stage=2, block='a', strides=(1, 1))
    x = identity_block(x, 3, [64, 64, 256], stage=2, block='b')
    x = identity_block(x, 3, [64, 64, 256], stage=2, block='c')

    x =     conv_block(x, 3, [128, 128, 512], stage=3, block='a')
    x = identity_block(x, 3, [128, 128, 512], stage=3, block='b')
    x = identity_block(x, 3, [128, 128, 512], stage=3, block='c')
    x = identity_block(x, 3, [128, 128, 512], stage=3, block='d')

    x =     conv_block(x, 3, [256, 256, 1024], stage=4, block='a')
    x = identity_block(x, 3, [256, 256, 1024], stage=4, block='b')
    x = identity_block(x, 3, [256, 256, 1024], stage=4, block='c')
    x = identity_block(x, 3, [256, 256, 1024], stage=4, block='d')
    x = identity_block(x, 3, [256, 256, 1024], stage=4, block='e')
    x = identity_block(x, 3, [256, 256, 1024], stage=4, block='f')

    x =     conv_block(x, 3, [512, 512, 2048], stage=5, block='a')
    x = identity_block(x, 3, [512, 512, 2048], stage=5, block='b')
    x = identity_block(x, 3, [512, 512, 2048], stage=5, block='c')

    x = AveragePooling2D((7, 7), name='avg_pool')(x)

    x = Flatten()(x)
    x = Dense(classes, activation='softmax', name='fc1000')(x)

    model = Model(img_input, x, name='resnet50')

    # Load the pre-trained weights
    model.load_weights("resnet50_weights_tf_dim_ordering_tf_kernels.h5")

    return model

model = ResNet50()
model.summary()

Model: "resnet50"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_1 (InputLayer)            [(None, 224, 224, 3) 0                                            
__________________________________________________________________________________________________
zero_padding2d (ZeroPadding2D)  (None, 230, 230, 3)  0           input_1[0][0]                    
__________________________________________________________________________________________________
conv1 (Conv2D)                  (None, 112, 112, 64) 9472        zero_padding2d[0][0]             
__________________________________________________________________________________________________
bn_conv1 (BatchNormalization)   (None, 112, 112, 64) 256         conv1[0][0]                      
__________________________________________________________________________________________________
activation (Activation)         (None, 112, 112, 64) 0           bn_conv1[0][0]                   
__________________________________________________________________________________________________
max_pooling2d (MaxPooling2D)    (None, 55, 55, 64)   0           activation[0][0]                 
__________________________________________________________________________________________________
2a_conv_block_conv1 (Conv2D)    (None, 55, 55, 64)   4160        max_pooling2d[0][0]              
__________________________________________________________________________________________________
2a_conv_block_bn1 (BatchNormali (None, 55, 55, 64)   256         2a_conv_block_conv1[0][0]        
__________________________________________________________________________________________________
2a_conv_block_relu1 (Activation (None, 55, 55, 64)   0           2a_conv_block_bn1[0][0]          
__________________________________________________________________________________________________
2a_conv_block_conv2 (Conv2D)    (None, 55, 55, 64)   36928       2a_conv_block_relu1[0][0]        
__________________________________________________________________________________________________
2a_conv_block_bn2 (BatchNormali (None, 55, 55, 64)   256         2a_conv_block_conv2[0][0]        
__________________________________________________________________________________________________
2a_conv_block_relu2 (Activation (None, 55, 55, 64)   0           2a_conv_block_bn2[0][0]          
__________________________________________________________________________________________________
2a_conv_block_conv3 (Conv2D)    (None, 55, 55, 256)  16640       2a_conv_block_relu2[0][0]        
__________________________________________________________________________________________________
2a_conv_block_res_conv (Conv2D) (None, 55, 55, 256)  16640       max_pooling2d[0][0]              
__________________________________________________________________________________________________
2a_conv_block_bn3 (BatchNormali (None, 55, 55, 256)  1024        2a_conv_block_conv3[0][0]        
__________________________________________________________________________________________________
2a_conv_block_res_bn (BatchNorm (None, 55, 55, 256)  1024        2a_conv_block_res_conv[0][0]     
__________________________________________________________________________________________________
2a_conv_block_add (Add)         (None, 55, 55, 256)  0           2a_conv_block_bn3[0][0]          
                                                                 2a_conv_block_res_bn[0][0]       
__________________________________________________________________________________________________
2a_conv_block_relu4 (Activation (None, 55, 55, 256)  0           2a_conv_block_add[0][0]          
__________________________________________________________________________________________________
2b_identity_block_conv1 (Conv2D (None, 55, 55, 64)   16448       2a_conv_block_relu4[0][0]        
__________________________________________________________________________________________________
2b_identity_block_bn1 (BatchNor (None, 55, 55, 64)   256         2b_identity_block_conv1[0][0]    

     =============================================================
                     Several lines are omitted here
     =============================================================
__________________________________________________________________________________________________
5c_identity_block_relu2 (Activa (None, 7, 7, 512)    0           5c_identity_block_bn2[0][0]      
__________________________________________________________________________________________________
5c_identity_block_conv3 (Conv2D (None, 7, 7, 2048)   1050624     5c_identity_block_relu2[0][0]    
__________________________________________________________________________________________________
5c_identity_block_bn3 (BatchNor (None, 7, 7, 2048)   8192        5c_identity_block_conv3[0][0]    
__________________________________________________________________________________________________
5c_identity_block_add (Add)     (None, 7, 7, 2048)   0           5c_identity_block_bn3[0][0]      
                                                                 5b_identity_block_relu4[0][0]    
__________________________________________________________________________________________________
5c_identity_block_relu4 (Activa (None, 7, 7, 2048)   0           5c_identity_block_add[0][0]      
__________________________________________________________________________________________________
avg_pool (AveragePooling2D)     (None, 1, 1, 2048)   0           5c_identity_block_relu4[0][0]    
__________________________________________________________________________________________________
flatten (Flatten)               (None, 2048)         0           avg_pool[0][0]                   
__________________________________________________________________________________________________
fc1000 (Dense)                  (None, 1000)         2049000     flatten[0][0]                    
==================================================================================================
Total params: 25,636,712
Trainable params: 25,583,592
Non-trainable params: 53,120
__________________________________________________________________________________________________
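Several numbers in the summary above can be checked by hand. The sketch below recomputes a few output sizes and parameter counts with the usual convolution formulas; the helper functions are my own illustrations, not Keras APIs.

```python
def conv_output_size(size, kernel, stride, padding=0):
    """Spatial size after a 'valid' convolution or pooling with explicit padding."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_params(kernel, in_channels, out_channels):
    """Weights plus one bias per output channel."""
    return kernel * kernel * in_channels * out_channels + out_channels


def batchnorm_params(channels):
    """gamma, beta, moving mean and moving variance per channel."""
    return 4 * channels


size = conv_output_size(224, kernel=7, stride=2, padding=3)  # conv1 after ZeroPadding2D
print(size)                                                  # 112
size = conv_output_size(size, kernel=3, stride=2)            # max pooling
print(size)                                                  # 55
for _ in range(3):                                           # stride-2 1x1 convs in stages 3-5
    size = conv_output_size(size, kernel=1, stride=2)
print(size)                                                  # 7

print(conv_params(7, 3, 64))      # 9472, conv1
print(conv_params(3, 64, 64))     # 36928, 2a_conv_block_conv2
print(batchnorm_params(64))       # 256, bn_conv1
print(2048 * 1000 + 1000)         # 2049000, fc1000
```

The results match the Output Shape and Param # columns for conv1, max_pooling2d, the 7x7 feature maps of stage 5, and fc1000.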

V. Compile

Before the model is ready for training, it needs a few more settings. These are added in the model's compile step:

Loss function: Used to measure the accuracy of the model during training.
Optimizer: Determines how the model is updated based on the data it sees and its own loss function.
Metrics: Used to monitor training and testing steps. The following example uses accuracy, which is the ratio of images that are correctly classified.
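To make the loss and the metric concrete, here is a per-sample sketch of sparse categorical cross-entropy and of accuracy in plain Python. It ignores details such as the numerical clipping and batch averaging that Keras applies, and the function names are illustrative only.

```python
import math


def sparse_categorical_crossentropy(probabilities, label):
    """Loss for one sample: negative log of the probability of the true class."""
    return -math.log(probabilities[label])


def accuracy(predicted_labels, true_labels):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(p == t for p, t in zip(predicted_labels, true_labels))
    return correct / len(true_labels)


# A confident, correct prediction gives a small loss ...
print(round(sparse_categorical_crossentropy([0.9, 0.05, 0.03, 0.02], 0), 4))  # 0.1054
# ... while the same prediction scored against another label is penalized heavily.
print(round(sparse_categorical_crossentropy([0.9, 0.05, 0.03, 0.02], 2), 4))  # 3.5066
print(accuracy([0, 1, 2, 3], [0, 1, 2, 0]))                                   # 0.75
```

"Sparse" means the labels are plain integers such as 0 to 3, which is what image_dataset_from_directory produces by default, rather than one-hot vectors.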

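# Set the optimizer. A custom learning rate is defined here, but note that
# opt is not passed to compile() below, so training uses Adam's default
# learning rate. Pass optimizer=opt instead to apply the custom value.
opt = tf.keras.optimizers.Adam(learning_rate=1e-7)

model.compile(optimizer="adam",
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])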

VI. Train the model

epochs = 10

history = model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=epochs
)

Epoch 1/10
57/57 [==============================] - 12s 86ms/step - loss: 2.4313 - accuracy: 0.6548 - val_loss: 213.7383 - val_accuracy: 0.3186
Epoch 2/10
57/57 [==============================] - 3s 52ms/step - loss: 0.4293 - accuracy: 0.8557 - val_loss: 9.0470 - val_accuracy: 0.2566
Epoch 3/10
57/57 [==============================] - 3s 52ms/step - loss: 0.2309 - accuracy: 0.9183 - val_loss: 1.4181 - val_accuracy: 0.7080
Epoch 4/10
57/57 [==============================] - 3s 53ms/step - loss: 0.1721 - accuracy: 0.9535 - val_loss: 2.5627 - val_accuracy: 0.6726
Epoch 5/10
57/57 [==============================] - 3s 53ms/step - loss: 0.0795 - accuracy: 0.9701 - val_loss: 0.2747 - val_accuracy: 0.8938
Epoch 6/10
57/57 [==============================] - 3s 52ms/step - loss: [value garbled in source] - accuracy: 0.9899 - val_loss: 0.1483 - val_accuracy: 0.9381
Epoch 7/10
57/57 [==============================] - 3s 52ms/step - loss: 0.0308 - accuracy: 0.9970 - val_loss: 0.1705 - val_accuracy: 0.9381
Epoch 8/10
57/57 [==============================] - 3s 52ms/step - loss: 0.0019 - accuracy: 1.0000 - val_loss: 0.0674 - val_accuracy: 0.9735
Epoch 9/10
57/57 [==============================] - 3s 52ms/step - loss: 8.2391e-04 - accuracy: 1.0000 - val_loss: 0.0720 - val_accuracy: 0.9735
Epoch 10/10
57/57 [==============================] - 3s 52ms/step - loss: 6.0079e-04 - accuracy: 1.0000 - val_loss: 0.0762 - val_accuracy: 0.9646

VII. Model evaluation

acc = history.history['accuracy']
val_acc = history.history['val_accuracy']

loss = history.history['loss']
val_loss = history.history['val_loss']

epochs_range = range(epochs)

plt.figure(figsize=(12, 4))
plt.subplot(1, 2, 1)
plt.suptitle("Wechat official Account: STUDENT K")

plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')

plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()
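Besides looking at the curves, the same history values can be summarized numerically. The sketch below picks the best epoch from the validation metrics; the helper best_epoch is hypothetical, and the numbers are copied from the training log in this article (epoch 9's val_loss is read as 0.0720).

```python
def best_epoch(values, mode="max"):
    """Return (1-based epoch, value) of the best entry; ties go to the earliest epoch."""
    pick = max if mode == "max" else min
    best = pick(values)
    return values.index(best) + 1, best


# Validation metrics copied from the training log in this article.
val_accuracy = [0.3186, 0.2566, 0.7080, 0.6726, 0.8938, 0.9381, 0.9381, 0.9735, 0.9735, 0.9646]
val_loss = [213.7383, 9.0470, 1.4181, 2.5627, 0.2747, 0.1483, 0.1705, 0.0674, 0.0720, 0.0762]

print(best_epoch(val_accuracy, "max"))  # (8, 0.9735)
print(best_epoch(val_loss, "min"))      # (8, 0.0674)
```

In a notebook, the lists would normally come from history.history['val_accuracy'] and history.history['val_loss'] instead of being typed in by hand.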

VIII. Save and load the model

This is the simplest way to save and load a model

# Save model
model.save('model/my_model.h5')

# Load model
new_model = keras.models.load_model('model/my_model.h5')

IX. Prediction

# Use the loaded model (new_model) to see the prediction results

plt.figure(figsize=(10, 5))  # The width of the figure is 10 and the height is 5
plt.suptitle("Wechat official Account: STUDENT K")

for images, labels in val_ds.take(1):
    for i in range(8):
        ax = plt.subplot(2, 4, i + 1)

        # Display images
        plt.imshow(images[i].numpy().astype("uint8"))

        # Need to add a dimension to the image
        img_array = tf.expand_dims(images[i], 0)

        # Use the model to predict the bird in the picture
        predictions = new_model.predict(img_array)
        plt.title(class_names[np.argmax(predictions)])

        plt.axis("off")
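One detail worth keeping in mind: the model built above keeps its 1000-unit output layer (classes=1000), while class_names has only four entries, so an argmax over the full output could in principle return an index that class_names does not have. A hedged sketch of a guard, using a made-up helper predicted_class and a fake output vector:

```python
def predicted_class(probabilities, class_names):
    """Name of the most probable class among the classes the dataset actually has."""
    # Only the first len(class_names) outputs correspond to dataset labels.
    scores = list(probabilities)[:len(class_names)]
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return class_names[best_index]


names = ['Bananaquit', 'Black Throated Bushtiti', 'Black skimmer', 'Cockatoo']
# Six outputs stand in for the model's 1000; index 5 has the top score overall.
fake_output = [0.10, 0.05, 0.30, 0.15, 0.00, 0.40]
print(predicted_class(fake_output, names))  # Black skimmer
```

With the real model, the input would be predictions[0] rather than fake_output. After training on labels 0 to 3 the first four outputs normally dominate, so this is a safety net and not a required change.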

