

  • Author: Student K
  • Data: reply "DL+29" in the official WeChat account (Student K) to get the dataset
  • Code: all of the code is included in this article; you can also download it from my GitHub

The dataset contains 1,462 images divided into nine categories: buoys, cruise ships, ferry boats, freight boats, gondolas, inflatable boats, kayaks, paper boats, and sailboats. We will use the ResNet50 algorithm to recognize these 9 types of targets; the final accuracy is 87.0%.

🥇 Students who need project customization or graduation-project guidance can add me on WeChat: mtyjkh_

My environment:

  • Language: Python3.8
  • Compiler: Jupyter Lab
  • Deep learning environment: TensorFlow2.4.1

Our code flow chart is as follows (figure not reproduced here):

1. Set GPU

import tensorflow as tf
gpus = tf.config.list_physical_devices("GPU")

if gpus:
    gpu0 = gpus[0]  # If there are multiple GPUs, use only the 0th GPU
    tf.config.experimental.set_memory_growth(gpu0, True)  # Grow GPU memory usage on demand
    tf.config.set_visible_devices([gpu0], "GPU")
    
import matplotlib.pyplot as plt
import os,PIL,pathlib
import numpy as np
import pandas as pd
import warnings
from tensorflow import keras

warnings.filterwarnings("ignore")  # Ignore warning messages
plt.rcParams['font.sans-serif'] = ['SimHei']  # Display Chinese labels correctly
plt.rcParams['axes.unicode_minus'] = False  # Display the minus sign correctly
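If you want to confirm that the device restriction took effect, a quick sanity check (my addition, not part of the original code) is:

# Optional: list the GPU devices TensorFlow will actually use
print("Visible GPUs:", tf.config.get_visible_devices("GPU"))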

2. Import data

1. Import data

import pathlib

data_dir = "./29-data/"
data_dir = pathlib.Path(data_dir)
image_count = len(list(data_dir.glob('*/*')))
print("Total number of pictures is:",image_count)
Total number of pictures is: 1462
batch_size = 16
img_height = 224
img_width  = 224
"" Image_dataset_from_directory () can be described in the following article: https://mtyjkh.blog.csdn.net/article/details/117018789 by importing data, this method can also disrupt data "" "
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    validation_split=0.1,
    subset="training",
    seed=12,
    image_size=(img_height, img_width),
    batch_size=batch_size)
Found 1462 files belonging to 9 classes.
Using 1316 files for training.
"" Image_dataset_from_directory () can be described in the following article: https://mtyjkh.blog.csdn.net/article/details/117018789 by importing data, this method can also disrupt data "" "
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    validation_split=0.1,
    subset="validation",
    seed=12,
    image_size=(img_height, img_width),
    batch_size=batch_size)
Found 1462 files belonging to 9 classes.
Using 146 files for validation.
class_names = train_ds.class_names
print("Data categories are:",class_names)
print("There are category % D ships to be identified."%len(class_names))
Data categories are: ['buoy', 'cruise ship', 'ferry boat', 'freight boat', 'gondola', 'inflatable boat', 'kayak', 'paper boat', 'sailboat']
There are 9 categories of ships to be identified.

2. Check the data

for image_batch, labels_batch in train_ds:
    print(image_batch.shape)
    print(labels_batch.shape)
    break
(16, 224, 224, 3)
(16,)

3. Configure the data set

  • shuffle(): shuffles the data (an optional sketch follows the code below).
  • prefetch(): prefetches data so that preprocessing overlaps with training, speeding things up. You can refer to my previous two articles for a detailed introduction.
  • cache(): caches the dataset in memory to speed up later epochs.

AUTOTUNE = tf.data.AUTOTUNE

def train_preprocessing(image, label):
    return (image / 255.0, label)

train_ds = (
    train_ds.cache()
    .map(train_preprocessing)    # The preprocessing function can be set here
    .prefetch(buffer_size=AUTOTUNE)
)

val_ds = (
    val_ds.cache()
    .map(train_preprocessing)    # The preprocessing function can be set here
    .prefetch(buffer_size=AUTOTUNE)
)
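Note that the pipeline above only uses cache() and prefetch(): image_dataset_from_directory() already shuffled the files when it loaded them (that is what seed=12 controls). If you also want to reshuffle between epochs, a minimal sketch (my addition, not part of the original pipeline):

# Optional: reshuffle the training batches each epoch with a 100-batch buffer
train_ds = train_ds.shuffle(buffer_size=100)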

4. Data visualization

plt.figure(figsize=(10, 8))  # The width of the figure is 10 and the height is 8
plt.suptitle("Data presentation")

for images, labels in train_ds.take(1):
    for i in range(15):
        plt.subplot(4, 5, i + 1)
        plt.xticks([])
        plt.yticks([])
        plt.grid(False)

        # Display the image
        plt.imshow(images[i])
        # Display the label
        plt.xlabel(class_names[labels[i]])

plt.show()

3. Build the model

During this training run I noticed an interesting phenomenon: when I used a complex network, the results were not ideal, but with a relatively simple network the results were fine.

from tensorflow.keras import layers, models, Input
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dense, Flatten, Dropout,BatchNormalization,Activation

# Load the pretrained model
base_model = keras.applications.ResNet50(weights='imagenet', include_top=False, input_shape=(img_width,img_height,3))

for layer in base_model.layers:
    layer.trainable = True
    
# Add layers at the end
X = base_model.output
X = Flatten()(X)

X = Dense(512, kernel_initializer='he_uniform')(X)
X = Dropout(0.5)(X)
X = BatchNormalization()(X)
X = Activation('relu')(X)

X = Dense(16, kernel_initializer='he_uniform')(X)
X = Dropout(0.5)(X)
X = BatchNormalization()(X)
X = Activation('relu')(X)

output = Dense(len(class_names), activation='softmax')(X)

model = Model(inputs=base_model.input, outputs=output)
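To check the resulting architecture and parameter count, you can print a summary (optional, not in the original article):

# Optional: inspect the layers and number of trainable parameters
model.summary()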

4. Compile

optimizer = tf.keras.optimizers.Adam(learning_rate=1e-4)

model.compile(optimizer=optimizer,
                loss='sparse_categorical_crossentropy',
                metrics=['accuracy'])

5. Train the model

from tensorflow.keras.callbacks import ModelCheckpoint, Callback, EarlyStopping, ReduceLROnPlateau, LearningRateScheduler

NO_EPOCHS = 50
PATIENCE  = 5
VERBOSE   = 1

# Set a dynamic learning rate
# annealer = LearningRateScheduler(lambda x: 1e-3 * 0.99 ** (x + NO_EPOCHS))

# Set early stop
earlystopper = EarlyStopping(monitor='loss', patience=PATIENCE, verbose=VERBOSE)

# Set a checkpoint to save the best model weights
checkpointer = ModelCheckpoint('best_model.h5',
                                monitor='val_accuracy',
                                verbose=VERBOSE,
                                save_best_only=True,
                                save_weights_only=True)
train_model  = model.fit(train_ds,
                  epochs=NO_EPOCHS,
                  verbose=1,
                  validation_data=val_ds,
                  callbacks=[earlystopper, checkpointer])
Epoch 1/50
83/83 [==============================] - 17s 109ms/step - loss: 1.8596 - accuracy: 0.3625 - val_loss: 2.2435 - val_accuracy: 0.3699
Epoch 00001: val_accuracy improved from -inf to 0.36986, saving model to best_model.h5
Epoch 2/50
83/83 [==============================] - 8s 94ms/step - loss: 1.5476 - accuracy: 0.5190 - val_loss: 2.1825 - val_accuracy: 0.1575
......
Epoch 00049: val_accuracy did not improve from 0.90411
Epoch 50/50
83/83 [==============================] - 7s 81ms/step - loss: 0.4809 - accuracy: 0.9111 - val_loss: 0.5607 - val_accuracy: 0.8699
Epoch 00050: val_accuracy did not improve from 0.90411
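Because ModelCheckpoint was called with save_weights_only=True, best_model.h5 contains only the weights. The run peaked at val_accuracy 0.90411, while the final epoch sits at 0.8699, and the evaluation below uses the final-epoch weights. If you wanted to score the best checkpoint instead, a small sketch (my addition):

# Restore the best weights saved by ModelCheckpoint, then re-evaluate
model.load_weights('best_model.h5')
score = model.evaluate(val_ds, verbose=0)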

6. Evaluate the model

1. Accuracy and Loss chart

acc = train_model.history['accuracy']
val_acc = train_model.history['val_accuracy']

loss = train_model.history['loss']
val_loss = train_model.history['val_loss']

epochs_range = range(len(acc))

plt.figure(figsize=(12, 4))
plt.subplot(1, 2, 1)

plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')

plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()

As you can see, the model's accuracy and loss fluctuate quite a lot, mainly because the dataset is small (1,462 images across 9 categories). After data augmentation, this will improve noticeably (a sketch of one possible augmentation pipeline follows).
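The augmentation step itself is not shown in this article. A minimal sketch using TensorFlow 2.4's experimental preprocessing layers (my assumption, not necessarily the author's exact pipeline):

from tensorflow.keras.layers.experimental import preprocessing

# A sketch of on-the-fly augmentation; apply it to the training set only
data_augmentation = tf.keras.Sequential([
    preprocessing.RandomFlip("horizontal"),
    preprocessing.RandomRotation(0.1),
])

train_ds = train_ds.map(lambda x, y: (data_augmentation(x, training=True), y))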

2. Confusion matrix

from sklearn.metrics import confusion_matrix
import seaborn as sns
import pandas as pd

# Define a function that plots the confusion matrix
def plot_cm(labels, predictions):

    # Generate the confusion matrix
    conf_numpy = confusion_matrix(labels, predictions)
    # Convert the matrix to a DataFrame
    conf_df = pd.DataFrame(conf_numpy, index=class_names, columns=class_names)

    plt.figure(figsize=(8, 7))
    
    sns.heatmap(conf_df, annot=True, fmt="d", cmap="BuPu")
    
    plt.title('Confusion matrix',fontsize=15)
    plt.ylabel('True value',fontsize=14)
    plt.xlabel('Predicted value',fontsize=14)
val_pre   = []
val_label = []

# Use val_ds.take(1) here if you only want part of the validation data for the confusion matrix
for images, labels in val_ds:
    for image, label in zip(images, labels):
        # Add a batch dimension to the image
        img_array = tf.expand_dims(image, 0)
        # Use the model to predict the type of ship in the image
        prediction = model.predict(img_array)

        val_pre.append(class_names[np.argmax(prediction)])
        val_label.append(class_names[label])
plot_cm(val_label, val_pre)
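If you prefer fractions to raw counts, scikit-learn can row-normalize the matrix directly (optional, my addition; requires scikit-learn >= 0.22):

# Optional: confusion matrix normalized over the true labels
conf_norm = confusion_matrix(val_label, val_pre, labels=class_names, normalize='true')
conf_df = pd.DataFrame(conf_norm, index=class_names, columns=class_names)
sns.heatmap(conf_df, annot=True, fmt=".2f", cmap="BuPu")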

3. Evaluation metrics

from sklearn import metrics

def test_accuracy_report(model):
    print(metrics.classification_report(val_label, val_pre, target_names=class_names)) 
    score = model.evaluate(val_ds, verbose=0)
    print('Loss function: %s, accuracy:' % score[0], score[1])
    
test_accuracy_report(model)
                 precision    recall  f1-score   support

           buoy       1.00      0.67      0.80         3
    cruise ship       0.86      0.82      0.84        22
     ferry boat       1.00      0.50      0.67         6
   freight boat       0.00      0.00      0.00         2
        gondola       0.91      1.00      0.96        32
inflatable boat       0.00      0.00      0.00         2
          kayak       0.76      0.86      0.81        22
     paper boat       1.00      0.33      0.50         3
       sailboat       0.88      0.96      0.92        54

       accuracy                           0.87       146
      macro avg       0.71      0.57      0.61       146
   weighted avg       0.85      0.87      0.85       146

Loss function: 0.5606985688209534, accuracy: 0.8698630332946777